{"645460":{"#nid":"645460","#data":{"type":"event","title":"ARC Colloquium:  Ashwin Pananjady (Georgia Tech)","body":[{"value":"\u003Cp align = \u0022center\u0022\u003E\u003Cstrong\u003EAlgorithms \u0026amp; Randomness Center (ARC) \u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp align = \u0022center\u0022\u003E\u003Cstrong\u003EAshwin Pananjady (Georgia Tech)\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp align = \u0022center\u0022\u003E\u003Cstrong\u003EMonday, April 26, 2021\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp align = \u0022center\u0022\u003E\u003Cstrong\u003EVirtual via Bluejeans - 11:00 am\u003C\/strong\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003ETitle:\u0026nbsp; \u003C\/strong\u003EToward instance-optimal policy evaluation: From lower bounds to algorithms\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cstrong\u003EAbstract:\u0026nbsp; \u003C\/strong\u003EThe paradigm of reinforcement learning has now made inroads in a wide range of applied problem domains. This empirical research has revealed the limitations of our theoretical understanding: popular RL algorithms exhibit a variety of behavior across domains and problem instances, and existing theoretical bounds, which are generally based on worst-case assumptions, can often produce pessimistic predictions. An important theoretical goal is to develop instance-specific analyses that help to reveal what aspects of a given problem make it \u0026quot;easy\u0026quot; or \u0026quot;hard\u0026quot;, and allow distinctions to be drawn between ostensibly similar algorithms.\u0026nbsp;Taking an approach grounded in nonparametric statistics, we initiate a study of this question for the policy evaluation problem. We show via information-theoretic lower bounds that many popular variants of temporal difference (TD) learning algorithms *do not* exhibit the optimal instance-specific performance in the finite-sample regime. On the other hand, making careful modifications to these algorithms does result in automatic adaptation to the intrinsic difficulty of the problem. When there is function approximation involved, our bounds also characterize the instance-optimal tradeoff between \u0026quot;bias\u0026quot; and \u0026quot;variance\u0026quot; in solving\u0026nbsp;projected fixed-point equations, a general class of problems that includes policy evaluation as a special case.\u003C\/p\u003E\r\n\r\n\u003Cp\u003EThe talk is based on joint work with Koulik Khamaru, Wenlong Mou, Feng Ruan, Martin Wainwright, and Michael Jordan.\u0026nbsp;No prior knowledge of reinforcement learning will be assumed.\u003C\/p\u003E\r\n\r\n\u003Cp\u003E----------------------------------\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Ca href=\u0022https:\/\/www.isye.gatech.edu\/users\/ashwin-pananjady-martin\u0022\u003ESpeaker\u0026#39;s Webpage\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Cem\u003EVideos of recent talks are available at: \u003C\/em\u003E\u003Ca href=\u0022http:\/\/arc.gatech.edu\/node\/121\u0022\u003Ehttp:\/\/arc.gatech.edu\/node\/121\u003C\/a\u003E\u003C\/p\u003E\r\n\r\n\u003Cp\u003E\u003Ca href=\u0022https:\/\/mailman.cc.gatech.edu\/mailman\/listinfo\/arc-colloq\u0022\u003EClick here to subscribe to the seminar email list: arc-colloq@Klauscc.gatech.edu \u003C\/a\u003E\u003C\/p\u003E\r\n","summary":null,"format":"limited_html"}],"field_subtitle":"","field_summary":"","field_summary_sentence":[{"value":"Toward instance-optimal policy evaluation: From lower bounds to algorithms - Virtual via Bluejeans at 11:00am"}],"uid":"27544","created_gmt":"2021-03-17 14:17:51","changed_gmt":"2021-04-21 12:00:42","author":"Francella Tonge","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2021-04-26T12:00:00-04:00","event_time_end":"2021-04-26T13:00:00-04:00","event_time_end_last":"2021-04-26T13:00:00-04:00","gmt_time_start":"2021-04-26 16:00:00","gmt_time_end":"2021-04-26 17:00:00","gmt_time_end_last":"2021-04-26 17:00:00","rrule":null,"timezone":"America\/New_York"},"extras":[],"groups":[{"id":"70263","name":"ARC"}],"categories":[],"keywords":[],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1795","name":"Seminar\/Lecture\/Colloquium"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"177814","name":"Postdoc"},{"id":"174045","name":"Graduate students"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}