<node id="641326">
  <nid>641326</nid>
  <type>event</type>
  <uid>
    <user id="34773"><![CDATA[34773]]></user>
  </uid>
  <created>1605538177</created>
  <changed>1605538177</changed>
  <title><![CDATA[Ph.D. Thesis Proposal - A Unified Framework for Finite-Sample Analysis of Reinforcement Learning Algorithms]]></title>
  <body><![CDATA[<p><strong>Student Name:</strong>&nbsp;Zaiwei Chen</p>

<p>Machine Learning Ph.D. Student</p>

<p><strong>Home School:&nbsp;</strong>Aerospace Engineering</p>

<p>Georgia Institute of Technology</p>

<h5><strong>Committee</strong></h5>

<p>1 Dr. John-Paul Clarke (Advisor, School of Industrial and Systems Engineering, School of Aerospace Engineering, Georgia Institute of Technology)</p>

<p>2 Dr. Siva Theja Maguluri (Co-advisor, School of Industrial and Systems Engineering, Georgia Institute of Technology)</p>

<p>3 Dr. Justin Romberg (School of Electrical and Computer Engineering, Georgia Institute of Technology)</p>

<p>4 Dr. Benjamin Van Roy (Department of Electrical Engineering, Department of Management Science &amp; Engineering, Stanford University) (external)</p>

<h5><strong>Abstract</strong></h5>

<p>Reinforcement learning (RL) captures an important facet of machine learning that goes beyond prediction and regression, namely sequential decision making, and has had a great impact on various problems of practical interest. The goal of this proposed thesis is to provide theoretical performance guarantees for RL algorithms. Specifically, we develop a universal approach for establishing finite-sample convergence bounds for RL algorithms, both with tabular representations and with function approximation. To achieve this, we consider general stochastic approximation algorithms and study their convergence bounds using a novel Lyapunov approach. The results give us insight into the behavior of RL algorithms.</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[ML Ph.D. student Zaiwei Chen presents his thesis proposal.]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2020-11-16T16:30:00-05:00]]></value>
      <value2><![CDATA[2020-11-16T18:00:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Postdoc]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>1299</item>
          <item>576481</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[GVU Center]]></item>
          <item><![CDATA[ML@GT]]></item>
      </og_groups_both>
  <field_categories>
      </field_categories>
  <field_keywords>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
