<node id="670761">
  <nid>670761</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1698672579</created>
  <changed>1698672579</changed>
  <title><![CDATA[PhD Defense by Evan Downing]]></title>
  <body><![CDATA[<p><span><span><strong><span><span><span>Title:</span></span></span></strong><span><span><span>&nbsp;Improving the Understanding of Malware using Machine Learning</span></span></span></span></span></p>

<p><span><span><strong><span><span><span>Date:</span></span></span></strong><span><span><span>&nbsp;Friday, November 17, 2023</span></span></span></span></span></p>

<p><span><span><strong><span><span><span>Time:</span></span></span></strong><span><span><span>&nbsp;11:30 AM to 1:00 PM EST</span></span></span></span></span></p>

<p><span><span><strong><span><span><span>Location:</span></span></span></strong><span><span><span>&nbsp;Coda C0903 Ansley</span></span></span></span></span></p>

<p><span><span><strong><span><span><span>Zoom link:</span></span></span></strong><span><span><span>&nbsp;<a href="https://gatech.zoom.us/j/98812589751?pwd=ZkxPUGVHWmVNTi8raFc2UlJGY3kzZz09">https://gatech.zoom.us/j/98812589751?pwd=ZkxPUGVHWmVNTi8raFc2UlJGY3kzZz09</a></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><strong><span><span><span>Evan Downing</span></span></span></strong></span></span></p>

<p><span><span><span><span><span>Ph.D. Candidate in Computer Science</span></span></span></span></span></p>

<p><span><span><span><span><span>School of Cybersecurity and Privacy</span></span></span></span></span></p>

<p><span><span><span><span><span>Georgia Institute of Technology</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><strong><span><span><span>Committee:</span></span></span></strong></span></span></p>

<p><span><span><span><span><span>Dr. Wenke Lee (advisor), School of Cybersecurity and Privacy, Georgia Institute of Technology</span></span></span></span></span></p>

<p><span><span><span><span><span>Dr. Mustaque Ahamad, School of Cybersecurity and Privacy, Georgia Institute of Technology</span></span></span></span></span></p>

<p><span><span><span><span><span>Dr. Brendan Saltaformaggio, School of Cybersecurity and Privacy, Georgia Institute of Technology</span></span></span></span></span></p>

<p><span><span><span><span><span>Dr. Fabian Monrose, School of Electrical and Computer Engineering, Georgia Institute of Technology</span></span></span></span></span></p>

<p><span><span><span><span><span>Dr. Frank Li, School of Cybersecurity and Privacy, Georgia Institute of Technology</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><strong><span><span><span>Abstract:</span></span></span></strong></span></span></p>

<p><span><span><span><span><span>Malicious software continues to threaten users who rely on computational devices. From destruction to the monetization of their victims’ information, malware authors seek to cause harm for their personal gain. Over the past few decades, automated solutions have been developed to catch and prevent malicious code from infecting and spreading throughout cyberspace. These solutions often rely on statistical properties that distinguish malware from goodware. However, these solutions are also seen as black boxes, forcing malware analysts to trust the models’ verdicts without allowing them to provide feedback from their own domain knowledge and expertise.</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><span><span>To address these challenges, I propose using human-in-the-loop design with Machine Learning (ML), which combines the best of both worlds by allowing expert analysts to both learn new insights from the results of malware detection models and provide feedback to improve the results of those models. This leads to a partnership, rather than a competition, between humans and algorithms. I first introduce DeepReflect, a deep learning system which identifies malicious functionality statically within malware binaries, allowing analysts to label clusters of similar functionality in a semi-supervised approach. DeepReflect increases the Area Under the Curve (AUC) value by 6-10% compared to four state-of-the-art approaches on a dataset of 36k unique, unpacked malware binaries. This helps analysts understand what a malware sample is capable of doing before they execute it. Next, I introduce BCRAFTY, a system which automatically creates dynamic analysis behavior combinations to improve malware detection, increasing the True Positive Rate (TPR) by 7.5% while keeping the False Positive Rate (FPR) near 0.3% compared to using analyst-defined behaviors alone. The system allows analysts to learn new behaviors not previously considered, increasing their understanding of how to improve malware detection, and to give feedback by accepting or rejecting suggested behavior combinations for the model to use.</span></span></span></span></span></p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[ Improving the Understanding of Malware using Machine Learning]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p><span><span><span>&nbsp;Improving the Understanding of Malware using Machine Learning</span></span></span></p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-11-17T11:30:00-05:00]]></value>
      <value2><![CDATA[2023-11-17T13:00:00-05:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[Coda C0903 Ansley]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
