{"620823":{"#nid":"620823","#data":{"type":"news","title":"Understanding How Data Scientists Understand Machine\u00a0Learning Models","body":[{"value":"\u003Cp\u003EHow do data scientists read and understand machine learning model outputs? This is the question that a new design probe built by a team of researchers led by\u0026nbsp;\u003Ca href=\u0022https:\/\/www.cse.gatech.edu\/\u0022\u003ESchool of Computational Science and Engineering\u003C\/a\u003E\u0026nbsp;(CSE) Ph.D. student\u0026nbsp;\u003Cstrong\u003EFred Hohman\u0026nbsp;\u003C\/strong\u003Eaims to answer.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Without good models and the right tools to interpret them, data scientists risk making decisions based on hidden biases, spurious correlations, and false generalizations. This has led to a rallying cry for model interpretability,\u0026rdquo; said Hohman.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETo address this issue, Hohman teamed up with\u0026nbsp;U.C. Berkeley Ph.D. candidate\u0026nbsp;\u003Cstrong\u003EAndrew Head\u0026nbsp;\u003C\/strong\u003Eand Microsoft researchers\u0026nbsp;\u003Cstrong\u003ERich Caruana\u003C\/strong\u003E,\u0026nbsp;\u003Cstrong\u003ERobert DeLine\u003C\/strong\u003E, and\u0026nbsp;\u003Cstrong\u003ESteven Drucker\u003C\/strong\u003E, to create\u0026nbsp;\u003Ca href=\u0022https:\/\/fredhohman.com\/papers\/gamut\u0022\u003EGamut\u003C\/a\u003E. Gamut is an interactive system designed to investigate how data scientists interpret models, and how interactive interfaces\u0026nbsp;can support data scientists in answering questions about model interpretability.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Machine learning is doing all this amazing work nowadays like cancer prediction, predicting fire risks in buildings, and poverty prediction via satellite images. 
But there are many applications where demographic bias such as gender, age, or race, is learned from data,\u0026rdquo; continued Hohman.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;That brings us to Gamut, which focuses on an area of machine learning called interpretability, which is essentially trying to understand what a machine learning algorithm has actually learned so data scientists can trust its predictions.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E[VIDEO::https:\/\/youtu.be\/R-amW_yNX6I::aVideoStyle]\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe system pairs generalized additive models (GAMs), which combine high accuracy with an inherently intelligible structure, with interactive data visualization to display model results and predictions. Its ultimate purpose is to study how data scientists use explainable interfaces for interpretability.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESurprisingly, while the term interpretability loosely describes a human understanding of some component of a model, there is no formal, agreed-upon definition of which component should be understood, according to Hohman. This is another reason why Gamut is a critical piece in solving the interpretability puzzle.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ERather than defining interpretability, Hohman says, Gamut aims to\u0026nbsp;operationalize it, that is, to\u0026nbsp;turn the\u0026nbsp;\u003Ca href=\u0022https:\/\/en.wikipedia.org\/wiki\/Fuzzy_concept\u0022 title=\u0022Fuzzy concept\u0022\u003Efuzzy concept\u003C\/a\u003E\u0026nbsp;of interpretability into something more usable and actionable.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Since machine learning models are still being used despite their problems, the idea is that we can break interpretability down into a suite of techniques to help data scientists interpret models today. 
And, by collaborating with Microsoft, our human-centered approach using rich user interaction and data visualization can be informed and tested by professional data scientists who work with machine learning daily.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Our investigation showed that interpretability is not a monolithic concept. Data scientists have different reasons to interpret models and tailor explanations for specific audiences, often balancing competing concerns of simplicity and completeness,\u0026rdquo; Hohman said.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"CSE Ph.D. Student Fred Hohman releases his latest software, Gamut, that aims to help data scientists understand machine learning outputs."}],"uid":"34540","created_gmt":"2019-04-23 17:48:52","changed_gmt":"2019-04-24 15:13:07","author":"Kristen Perez","boilerplate_text":"","field_publication":"","field_article_url":"","dateline":{"date":"2019-04-24T00:00:00-04:00","iso_date":"2019-04-24T00:00:00-04:00","tz":"America\/New_York"},"extras":[],"hg_media":{"620818":{"id":"620818","type":"image","title":"Gamut - Visualization Software","body":null,"created":"1556040405","gmt_created":"2019-04-23 17:26:45","changed":"1556040405","gmt_changed":"2019-04-23 17:26:45","alt":"Interacting with Gamut\u0027s multiple coordinated views together. (A) Selecting the OverallQual feature from the sorted Feature Sidebar displays its shape curve in the Shape Curve View. 
(B) Brushing over either explanation for Instance 550 or Instance 798 shows the contribution of the Ove","file":{"fid":"236429","name":"19-gamut-chi.png","image_path":"\/sites\/default\/files\/images\/19-gamut-chi.png","image_full_path":"http:\/\/www.tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/19-gamut-chi.png","mime":"image\/png","size":275452,"path_740":"http:\/\/www.tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/19-gamut-chi.png?itok=k7g70m_n"}}},"media_ids":["620818"],"groups":[{"id":"47223","name":"College of Computing"},{"id":"431631","name":"OMS"},{"id":"50877","name":"School of Computational Science and Engineering"}],"categories":[{"id":"8862","name":"Student Research"}],"keywords":[{"id":"9167","name":"machine learning"},{"id":"7257","name":"visualization"},{"id":"167449","name":"software"}],"core_research_areas":[{"id":"39431","name":"Data Engineering and Science"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003EKristen Perez\u003C\/p\u003E\r\n\r\n\u003Cp\u003ECommunications Officer\u003C\/p\u003E\r\n","format":"limited_html"}],"email":["kristen.perez@cc.gatech.ed"],"slides":[],"orientation":[],"userdata":""}}}