{"627269":{"#nid":"627269","#data":{"type":"news","title":"First Topic Model Paper Wins GEM of PODS","body":[{"value":"\u003Cp\u003EUnsupervised learning relies on finding relevant information in large databases. This is possible thanks in part to groundbreaking research by School of Computer Science Professor \u003Ca href=\u0022https:\/\/www.cc.gatech.edu\/~vempala\/\u0022\u003E\u003Cstrong\u003ESantosh Vempala\u003C\/strong\u003E\u003C\/a\u003E and his collaborators 20 years ago.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe team won a test-of-time award for their impactful work on information retrieval. The annual conference Principles of Database Systems (PODS) gave their paper, \u003Ca href=\u0022https:\/\/www.google.com\/url?sa=t\u0026amp;rct=j\u0026amp;q=\u0026amp;esrc=s\u0026amp;source=web\u0026amp;cd=4\u0026amp;ved=2ahUKEwjzmPWL1enkAhUSMawKHQwdD-gQFjADegQIBRAB\u0026amp;url=https%3A%2F%2Fcourses.cs.washington.edu%2Fcourses%2Fcse522%2F05au%2Fpapadimitriou_LSI.pdf\u0026amp;usg=AOvVaw25jNkt6sbmSZh9-M2mrFDS\u0022\u003E\u003Cem\u003ELatent Semantic Indexing: A Probabilistic Analysis\u003C\/em\u003E\u003C\/a\u003E\u003Cem\u003E,\u003C\/em\u003E its \u003Ca href=\u0022https:\/\/databasetheory.org\/gems\u0022\u003EGems of PODS\u003C\/a\u003E honor at this year\u0026rsquo;s conference in May.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe 1998 paper analyzed a popular spectral algorithm and introduced the very first topic model, now a standard in unsupervised learning.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe researchers discovered that if a database or corpus is viewed as a matrix, a computer algorithm can perform singular-value decomposition, a matrix reduction technique that pulls out the most significant directions to explain the data. 
This step not only distorts the data minimally but also yields better retrieval results than the full original matrix.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETheir topic model was able to identify the original underlying topics. The model and guarantees have been significantly enhanced in the decades since the paper was published.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;This was one of the first provable techniques for automatically extracting information from data,\u0026rdquo; Vempala said.\u003C\/p\u003E\r\n\r\n\u003Cp\u003ETheir work has influenced prominent computing fields such as spectral methods, data mining, machine learning, and deep neural networks.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EVempala wrote the paper when he was a summer intern at IBM with \u003Cstrong\u003EPrabhakar Raghavan\u003C\/strong\u003E (now VP of Engineering at Google), together with Columbia University Professor \u003Cstrong\u003EChristos Papadimitriou\u003C\/strong\u003E (then at Berkeley), and Meiji University Professor \u003Cstrong\u003EHisao Tamaki\u003C\/strong\u003E. Papadimitriou gave the award talk.\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":" The team won a test-of-time award for their impactful work on information retrieval. 
"}],"uid":"34541","created_gmt":"2019-10-07 18:19:24","changed_gmt":"2019-10-11 15:10:01","author":"Tess Malone","boilerplate_text":"","field_publication":"","field_article_url":"","dateline":{"date":"2019-10-07T00:00:00-04:00","iso_date":"2019-10-07T00:00:00-04:00","tz":"America\/New_York"},"extras":[],"hg_media":{"350051":{"id":"350051","type":"image","title":"Santosh Vempala compressed","body":null,"created":"1449245702","gmt_created":"2015-12-04 16:15:02","changed":"1475895075","gmt_changed":"2016-10-08 02:51:15","alt":"Santosh Vempala compressed","file":{"fid":"201072","name":"santosh-vempala_0.jpg","image_path":"\/sites\/default\/files\/images\/santosh-vempala_0_0.jpg","image_full_path":"http:\/\/www.tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/santosh-vempala_0_0.jpg","mime":"image\/jpeg","size":12220,"path_740":"http:\/\/www.tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/santosh-vempala_0_0.jpg?itok=FvLZpAvv"}}},"media_ids":["350051"],"groups":[{"id":"50875","name":"School of Computer Science"},{"id":"47223","name":"College of Computing"}],"categories":[],"keywords":[],"core_research_areas":[],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003ETess Malone, Communications Officer\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Ca href=\u0022mailto:tess.malone@cc.gatech.edu\u0022\u003Etess.malone@cc.gatech.edu\u003C\/a\u003E\u003C\/p\u003E\r\n","format":"limited_html"}],"email":["tess.malone@cc.gatech.edu"],"slides":[],"orientation":[],"userdata":""}}}