{"622219":{"#nid":"622219","#data":{"type":"news","title":"New Machine Learning Algorithms Keep Group Data Diverse","body":[{"value":"\u003Cp\u003EGeorgia Tech researchers have created machine learning (ML) algorithms to ensure grouped data is fairly represented.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis is the first example of incorporating fairness into the popular spectral clustering technique for partitioning graph data, according to the researchers. When evaluated on social networks like Facebook, their algorithms improve the groups' diversity by 10 to 34 percent on average.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EPromoting fairness\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EML can automate complex social and financial processes, like lending, education, and marketing. Yet for all its innovation, the potential for bias arises because many datasets contain disproportionate examples of one demographic.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe challenge of keeping ML fair becomes only more complicated with grouped data, or clusters. Social networks, for example, rely on large graphs of data that connect various people to each other. Enough of these connections can indicate a community \u0026mdash;\u0026nbsp;valuable data for advertisers and other stakeholders.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESpectral clustering is a common ML technique for finding these communities. With the new emphasis on fairness in ML, though, ensuring these communities are diverse is becoming more important.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Obviously, you want to figure out who the communities are, but you also want them to be diverse,\u0026rdquo; said School of Computer Science (SCS) Ph.D. 
student \u003Ca href=\u0022http:\/\/www.samirasamadi.com\u0022\u003E\u003Cstrong\u003ESamira Samadi\u003C\/strong\u003E\u003C\/a\u003E.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EKeeping the proportions\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003ESamadi and her team characterize diversity as each demographic group being represented in each cluster in the same proportion as in the entire dataset. To do this, they designed clustering algorithms that find a fairer clustering whenever one is available in the data.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe researchers tested their algorithms on a natural variant of the stochastic block model, a famous random graph model used to study the performance of clustering algorithms. On this model, they proved that their algorithms recover the fairer clustering in the data with high probability.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EYet the algorithms are not just theoretical. The researchers also tested them on empirical datasets and showed that the algorithms can lead to more proportional clusters with minimal damage to the interconnectivity of the groups, or the quality of the clusters.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;Designing fair clustering algorithms helps ML to draw a more diverse image of communities in a network,\u0026rdquo; Samadi said. \u0026ldquo;This not only leads to less representational bias toward specific demographics, but could also help marketers to maximize their full potential customer base.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe researchers presented their work in the paper, \u003Ca href=\u0022https:\/\/arxiv.org\/pdf\/1902.11281.pdf\u0022\u003E\u003Cem\u003EGuarantees for Spectral Clustering with Fairness Constraints\u003C\/em\u003E\u003C\/a\u003E, at the \u003Ca href=\u0022https:\/\/icml.cc\/\u0022\u003EInternational Conference on Machine Learning\u003C\/a\u003E (ICML) in Long Beach, California, from June 9 to 15. 
Samadi co-wrote the paper with SCS Assistant Professor \u003Ca href=\u0022http:\/\/jamiemorgenstern.com\/\u0022\u003E\u003Cstrong\u003EJamie Morgenstern\u003C\/strong\u003E\u003C\/a\u003E, Rutgers postdoctoral researcher \u003Cstrong\u003EMatth\u0026auml;us Kleindessner\u003C\/strong\u003E, and Rutgers Assistant Professor \u003Cstrong\u003EPranjal Awasthi\u003C\/strong\u003E.\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Georgia Tech researchers have created machine learning (ML) algorithms to ensure grouped data is fairly represented."}],"uid":"34541","created_gmt":"2019-06-04 16:19:07","changed_gmt":"2019-06-06 13:34:30","author":"Tess Malone","boilerplate_text":"","field_publication":"","field_article_url":"","dateline":{"date":"2019-06-04T00:00:00-04:00","iso_date":"2019-06-04T00:00:00-04:00","tz":"America\/New_York"},"extras":[],"hg_media":{"622220":{"id":"622220","type":"image","title":"Fair Clustering","body":null,"created":"1559665512","gmt_created":"2019-06-04 16:25:12","changed":"1559665512","gmt_changed":"2019-06-04 16:25:12","alt":"How fair clustering works","file":{"fid":"237002","name":"clusteringFair.png","image_path":"\/sites\/default\/files\/images\/clusteringFair.png","image_full_path":"http:\/\/www.tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/clusteringFair.png","mime":"image\/png","size":128546,"path_740":"http:\/\/www.tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/clusteringFair.png?itok=GliXOmwZ"}}},"media_ids":["622220"],"groups":[{"id":"47223","name":"College of Computing"},{"id":"50875","name":"School of Computer 
Science"},{"id":"576481","name":"ML@GT"}],"categories":[],"keywords":[],"core_research_areas":[],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003ETess Malone, Communications Officer\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Ca href=\u0022mailto:tess.malone@cc.gatech.edu\u0022\u003Etess.malone@cc.gatech.edu\u003C\/a\u003E\u003C\/p\u003E\r\n","format":"limited_html"}],"email":["tess.malone@cc.gatech.edu"],"slides":[],"orientation":[],"userdata":""}}}