<node id="643579">
  <nid>643579</nid>
  <type>event</type>
  <uid>
    <user id="34540"><![CDATA[34540]]></user>
  </uid>
  <created>1611861517</created>
  <changed>1611861530</changed>
  <title><![CDATA[CSE Seminar with University of California, Berkeley Department of Statistics Postdoc Michał Dereziński]]></title>
  <body><![CDATA[<p><strong>Name:</strong>&nbsp;Michał Dereziński</p>

<p><strong>Date/Time:&nbsp;</strong>Tuesday, February 9 @ 11:00 am</p>

<p><strong>Link:&nbsp;</strong><a href="https://bluejeans.com/6622130444">https://bluejeans.com/6622130444</a></p>

<p><strong>Title:</strong>&nbsp;Bridging algorithmic and statistical randomness in machine learning<br />
<br />
<strong>Abstract:</strong><br />
Randomness is a key resource in designing efficient algorithms, and it is also a fundamental modeling framework in statistics and machine learning. Methods that lie at the intersection of algorithmic and statistical randomness are at the forefront of modern data science. In this talk, I will discuss how statistical assumptions affect the bias-variance trade-offs and performance characteristics of randomized algorithms for, among other tasks, linear regression, stochastic optimization, and dimensionality reduction. I will also present an efficient algorithmic framework, called joint sampling, which is used to both predict and improve the statistical performance of machine learning methods by injecting carefully chosen correlations into randomized algorithms.<br />
<br />
In the first part of the talk, I will focus on the phenomenon of inversion bias, which is a systematic bias caused by inverting random matrices. Inversion bias is a significant bottleneck in parallel and distributed approaches to linear regression, second-order optimization, and a range of statistical estimation tasks. Here, I will introduce a joint sampling technique called Volume Sampling, which is the first method to eliminate inversion bias in model averaging. In the second part, I will demonstrate how the spectral properties of data distributions determine the statistical performance of machine learning algorithms, going beyond worst-case analysis and revealing new phase transitions in statistical learning. Along the way, I will highlight a class of joint sampling methods called Determinantal Point Processes (DPPs), popularized in machine learning over the past fifteen years as a tractable model of diversity. In particular, I will present a new algorithmic technique called Distortion-Free Intermediate Sampling, which drastically reduces the computational cost of DPPs, turning them into a practical tool for large-scale data science.<br />
<br />
<strong>Bio:</strong><br />
Michał Dereziński is a postdoctoral fellow in the Department of Statistics at the University of California, Berkeley. Previously, he was a research fellow at the Simons Institute for the Theory of Computing (Fall 2018, Foundations of Data Science program). He obtained his Ph.D. in Computer Science at the University of California, Santa Cruz, advised by Professor Manfred Warmuth, where he received the Best Dissertation Award for his work on sampling methods in statistical learning. Michał&#39;s current research is focused on developing scalable randomized algorithms with robust statistical guarantees for machine learning, data science and optimization. His work on reducing the cost of interpretability in dimensionality reduction received the Best Paper Award at the Thirty-fourth Conference on Neural Information Processing Systems. More information is available at: <a href="https://users.soe.ucsc.edu/~mderezin/">https://users.soe.ucsc.edu/~mderezin/</a>.</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[CSE Seminar with University of California, Berkeley Department of Statistics Postdoc Michał Dereziński]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2021-02-09T11:00:00-05:00]]></value>
      <value2><![CDATA[2021-02-09T12:00:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Undergraduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[<p>Kristen Perez</p>

<p>kristen.perez@cc.gatech.edu</p>
]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>47223</item>
          <item>50877</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[College of Computing]]></item>
          <item><![CDATA[School of Computational Science and Engineering]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1795</tid>
        <value><![CDATA[Seminar/Lecture/Colloquium]]></value>
      </item>
      </field_categories>
  <field_keywords>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
