{"648023":{"#nid":"648023","#data":{"type":"news","title":"Blending Old and New Schools: Machine Learning Mixes with Traditional Science Principles ","body":[{"value":"\u003Cp\u003EMachine learning came along at just the right time. The world is now awash in more data than ever before, and computer algorithms that can learn and improve as they perform data analysis promise to help scientists handle that information overload.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EYet researchers who think that machine learning by itself can help solve complex problems in science, engineering, and medicine should strive for a more balanced approach, says\u0026nbsp;\u003Ca href=\u0022https:\/\/physics.gatech.edu\/user\/roman-grigoriev\u0022\u003ERoman Grigoriev\u003C\/a\u003E, part of a\u0026nbsp;\u003Ca href=\u0022https:\/\/physics.gatech.edu\/\u0022\u003ESchool of Physics\u003C\/a\u003E\u0026nbsp;team with new\u0026nbsp;\u003Ca href=\u0022https:\/\/www.nature.com\/articles\/s41467-021-23479-0\u0022\u003Eresearch\u003C\/a\u003E\u0026nbsp;suggesting a hybrid approach for conducting science that blends new-era technologies, old-school experimentation, and theoretical analysis. The research suggests faster solutions to complex, data-intensive riddles involving such issues as cancer, earthquakes, weather forecasts, and climate change.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;It\u0026rsquo;s a combination of existing theoretical understanding \u0026mdash; as well as experimental data with machine learning,\u0026rdquo; says Grigoriev, Physics professor and lead investigator of the\u0026nbsp;\u003Ca href=\u0022http:\/\/cns.physics.gatech.edu\/~roman\/index.html\u0022\u003EDynamics and Control Group\u003C\/a\u003E. \u0026ldquo;Oftentimes people who do machine learning kind of forget about theoretical understanding and almost rely totally on data. 
It\u0026rsquo;s relatively simple, but when there\u0026rsquo;s a lot of data and not enough structure in that data, that approach is bound to fail.\u0026rdquo; Grigoriev explains that there\u0026#39;s often just too much data to meaningfully analyze, at which point \u0026ldquo;the problem becomes intractable. Essentially, harnessing appropriate domain knowledge is critical for finding structure in the data.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Ca href=\u0022https:\/\/www.nature.com\/articles\/s41467-021-23479-0\u0022\u003E\u0026ldquo;Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression\u003C\/a\u003E,\u0026rdquo; was published in May in\u0026nbsp;\u003Ca href=\u0022https:\/\/www.nature.com\/ncomms\/\u0022\u003E\u003Cem\u003ENature Communications\u003C\/em\u003E\u003C\/a\u003E. Fellow School of Physics researchers involved in the study are\u0026nbsp;\u003Ca href=\u0022https:\/\/physics.gatech.edu\/user\/michael-schatz\u0022\u003EMichael Schatz\u003C\/a\u003E, professor and the School\u0026rsquo;s interim chair; graduate research assistant\u0026nbsp;\u003Ca href=\u0022https:\/\/physics.gatech.edu\/user\/logan-kageorge\u0022\u003ELogan Kageorge\u003C\/a\u003E; and former graduate research assistant\u0026nbsp;\u003Ca href=\u0022https:\/\/www.linkedin.com\/in\/patrick-reinbold-05aa78136\u0022\u003EPatrick A.K. Reinbold\u003C\/a\u003E.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EThe problem with high-dimensional data\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EMachine learning uses computer algorithms to find patterns in data, but \u0026ldquo;most popular machine learning approaches present results in a form that is hard to interpret and\u0026nbsp;explain,\u0026rdquo; Grigoriev says. 
\u0026ldquo;Unless you understand the how and the why, you can\u0026rsquo;t really say you understand a problem.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EUnderstanding and predicting complicated behaviors \u0026mdash; by crunching a lot of dense, rich data \u0026mdash; can help with fundamental and practical problems in science arenas like weather forecasting and characterizing cardiac arrhythmias.\u0026nbsp;The problem is\u0026nbsp;that most of those arenas involve \u0026ldquo;high-dimensional\u0026rdquo; data, which means exactly what it sounds like: data with a lot of dimensions or variables, sometimes millions of them.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe dimensionality of the data is so large that \u0026ldquo;you get lost and it\u0026rsquo;s hard to see any trends,\u0026rdquo; Grigoriev says.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EHis team has come up with a hybrid approach that blends machine learning with elements of the traditional process of scientific discovery. That means a theoretical description, observations, designing experiments to test the description, and \u0026ldquo;then going back and forth between improving the theories, and designing new experiments. That\u0026rsquo;s been the traditional approach for hundreds of years.\u0026rdquo;\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe foundation of science\u0026#39;s understanding and progress relies on that scientific method \u0026mdash; the combination of theory and experimentation. \u0026ldquo;They\u0026rsquo;re not developed just based on the\u0026nbsp;data. 
They\u0026nbsp;are developed using both existing knowledge as well as some general fundamental laws.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAn approach that spotlights the beauty of equations\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003EConstraining the data to include just those variables that pertain directly to the experiment in\u0026nbsp;question is vital in working with high-dimensional data, Grigoriev says.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;What this approach allows you to do is identify a simpler model that uses the variables you need. It\u0026rsquo;s a simplified description that applies to a particular situation, but obtained using data that\u0026rsquo;s computational or experimental. It can do both.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe result is represented in a mathematical model, Grigoriev says, and \u0026ldquo;once you see those equations, you understand what the variables are. The equations certainly help explain the essence of a physical problem.\u0026rdquo; His team\u0026rsquo;s approach was validated in the research with a fluid dynamics experiment. A thin layer of liquid was suspended in a rectangular tank, with magnetic and electrical fields shot through it to create what physicists call a turbulent flow \u0026mdash; irregular shifts happening within the fluid layer that can rapidly change direction and magnitude.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EGrigoriev and his team used their hybrid approach to analyze the accessible data, in this case the velocity of the water. 
Subsequently, they were able to reconstruct variables that couldn\u0026rsquo;t be measured directly, like water pressure and force.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThis is the beauty of the equations \u0026mdash; how much they allow you to do,\u0026nbsp;Grigoriev says.\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026ldquo;What we do get is an equation, or set of equations, which are in a familiar form. We know how to explain, how to solve the problem using these equations. This is the nice thing about this approach. We\u0026rsquo;re working with variables whose meaning we understand; we know how to interpret them.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe team believes the study\u0026rsquo;s results will lead to advances like faster, more accurate ways to make predictions of complicated behavior in those large, real-world problems in science, engineering, and medicine.\u0026nbsp;For example, as Grigoriev\u0026rsquo;s team\u0026rsquo;s research states, \u0026ldquo;the ability to identify and quantify important patterns and sequences in atmospheric turbulence should enable weather forecasts that are better and more rapid than those currently possible today.\u0026rdquo;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cem\u003EThis material is based upon work supported by the National Science Foundation under Grants No. CMMI-1725587 and CMMI-2028454. The experimental data used in this work was produced by Jeff Tithof. The magnetic field measurements were performed with assistance from Charles Haynes. 
\u003Ca href=\u0022https:\/\/doi.org\/10.1038\/s41467-021-23479-0\u0022\u003Ehttps:\/\/doi.org\/10.1038\/s41467-021-23479-0\u003C\/a\u003E\u003C\/em\u003E\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":[{"value":"Solving big science problems with new roadmap that blends potent data analysis tools with \u0027existing theoretical understanding\u0027"}],"field_summary":[{"value":"\u003Cp\u003EA quartet of School of Physics researchers has come up with a roadmap that blends powerful new machine learning tools with traditional scientific investigative techniques and theories, all in the hopes of solving the biggest problems in science, engineering, and medicine.\u003C\/p\u003E\r\n","format":"limited_html"}],"field_summary_sentence":[{"value":"Solving big science problems with new roadmap that blends potent data analysis tools with \u0027existing theoretical understanding\u0027"}],"uid":"34434","created_gmt":"2021-06-09 19:48:03","changed_gmt":"2021-06-24 19:02:50","author":"Renay San Miguel","boilerplate_text":"","field_publication":"","field_article_url":"","dateline":{"date":"2021-06-16T00:00:00-04:00","iso_date":"2021-06-16T00:00:00-04:00","tz":"America\/New_York"},"extras":[],"hg_media":{"648176":{"id":"648176","type":"image","title":"A fluid dynamics experiment shows small fluorescent particles carried along by the flow. The particles represent the types of data used in the School of Physics study. 
(Credit: Roman Grigoriev)","body":null,"created":"1623870072","gmt_created":"2021-06-16 19:01:12","changed":"1623870072","gmt_changed":"2021-06-16 19:01:12","alt":"","file":{"fid":"246059","name":"2021 06 Roman Grigoriev - research - Crisp Seeded Flow.jpg","image_path":"\/sites\/default\/files\/images\/2021%2006%20Roman%20Grigoriev%20-%20research%20-%20Crisp%20Seeded%20Flow.jpg","image_full_path":"http:\/\/www.tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/2021%2006%20Roman%20Grigoriev%20-%20research%20-%20Crisp%20Seeded%20Flow.jpg","mime":"image\/jpeg","size":1014405,"path_740":"http:\/\/www.tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/2021%2006%20Roman%20Grigoriev%20-%20research%20-%20Crisp%20Seeded%20Flow.jpg?itok=MCgpz90p"}},"648024":{"id":"648024","type":"image","title":"Roman Grigoriev","body":null,"created":"1623268247","gmt_created":"2021-06-09 19:50:47","changed":"1623268247","gmt_changed":"2021-06-09 19:50:47","alt":"","file":{"fid":"245990","name":"RG5.jpg","image_path":"\/sites\/default\/files\/images\/RG5.jpg","image_full_path":"http:\/\/www.tlwarc.hg.gatech.edu\/\/sites\/default\/files\/images\/RG5.jpg","mime":"image\/jpeg","size":238786,"path_740":"http:\/\/www.tlwarc.hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/images\/RG5.jpg?itok=NBWAgycv"}}},"media_ids":["648176","648024"],"related_links":[{"url":"https:\/\/cos.gatech.edu\/news\/12-proposals-achieve-college-sciences-strategic-goals-funded-sutherland-deans-chair","title":"12 Proposals to Achieve College of Sciences Strategic Goals Funded by Sutherland Dean\u0027s Chair"},{"url":"https:\/\/cos.gatech.edu\/news\/open-source-machine-learning-tool-could-help-choose-cancer-drugs","title":"Open Source Machine Learning Tool Could Help Choose Cancer Drugs"}],"groups":[{"id":"1278","name":"College of Sciences"},{"id":"126011","name":"School of Physics"},{"id":"1188","name":"Research Horizons"}],"categories":[{"id":"150","name":"Physics and 
Physical Sciences"}],"keywords":[{"id":"4896","name":"College of Sciences"},{"id":"166937","name":"School of Physics"},{"id":"187915","name":"go-researchnews"},{"id":"170035","name":"Roman Grigoriev"},{"id":"40211","name":"Michael Schatz"},{"id":"188026","name":"Logan Kageorge"},{"id":"188027","name":"Patrick A.K. Reinbold"},{"id":"9167","name":"machine learning"},{"id":"188028","name":"scientific observations"},{"id":"188029","name":"mathematical equations"}],"core_research_areas":[{"id":"39431","name":"Data Engineering and Science"},{"id":"39501","name":"People and Technology"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[{"value":"\u003Cp\u003ERenay San Miguel\u003Cbr \/\u003E\r\nCommunications Officer II\/Science Writer\u003Cbr \/\u003E\r\nCollege of Sciences\u003Cbr \/\u003E\r\n404-894-5209\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n","format":"limited_html"}],"email":["renay.san@cos.gatech.edu"],"slides":[],"orientation":[],"userdata":""}}}