<node id="394591">
  <nid>394591</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1428492813</created>
  <changed>1475892695</changed>
  <title><![CDATA[PhD Defense by Tonya Woods]]></title>
  <body><![CDATA[<p>Title: &nbsp;<strong>Extracting Meaningful Statistics for the Characterization and Classification of Biological, Medical, and Financial Data</strong></p><p>&nbsp;</p><p>Advisor:&nbsp; Professor Brani Vidakovic</p><p>&nbsp;</p><p>Committee members:&nbsp; Professor Yajun Mei, Professor Kamran Paynabar, Professor Mirjana Milosevic-Brockett (School of Biology), Dr. Scott Nickleach (Equifax INC)</p><p>&nbsp;</p><p>Date and time:&nbsp; Friday, May 1, 2015, 10:00 AM</p><p>&nbsp;</p><p>Location:&nbsp; ISyE Groseclose, Room 226A</p><p>&nbsp;</p><p><strong>Abstract:</strong></p><p>&nbsp;</p><p>This thesis focuses on extracting meaningful statistics for the characterization and classification of biological, medical, and financial data and contains four chapters.&nbsp; The first chapter contains theoretical background on scaling and wavelets, which supports the work in chapters two and three.</p><p>&nbsp;</p><p>In the second chapter, we outline a methodology for representing sequences of DNA nucleotides as numeric matrices in order to analytically investigate important structural characteristics of DNA.&nbsp; This methodology involves assigning unit vectors to nucleotides, placing the vectors into columns of a matrix, and accumulating across the rows of this matrix.&nbsp; Transcribing the DNA in this way allows us to compute the 2-D wavelet transformation and assess regularity characteristics of the sequence via the slope of the wavelet spectra.&nbsp; In addition to computing a global slope measure for a sequence, we can apply our methodology to overlapping sections of nucleotides to obtain an evolutionary slope.</p><p>&nbsp;</p><p>In the third chapter, we describe various ways wavelet-based scaling may be used for cancer diagnostics.&nbsp; There were nearly half a million new cases of ovarian, breast, and lung cancer in the United States last year.&nbsp; Breast and lung cancer have the highest prevalence, while ovarian cancer has the lowest survival 
rate of the three.&nbsp; Early detection is critical for all of these diseases, but substantial obstacles to early detection exist in each case.&nbsp; In this work, we use wavelet-based scaling on metabolic data and radiography images in order to produce meaningful features to be used in classifying cases and controls.&nbsp; Computer-aided detection (CAD) algorithms for detecting lung and breast cancer often focus on select features in an image and make a priori assumptions about the nature of a nodule or a mass.&nbsp; In contrast, our approach to analyzing breast and lung images captures information contained in the background tissue of images as well as information about specific features and makes no such a priori assumptions.</p><p>&nbsp;</p><p>In the fourth chapter, we investigate the value of social media data in building commercial default and activity credit models.&nbsp; We use random forest modeling, which has been shown in many instances to achieve better predictive accuracy than logistic regression in modeling credit data.&nbsp; This result is of interest, as some entities are beginning to build credit scores based on this type of publicly available online data alone.&nbsp; Our work has shown that the addition of social media data does not provide any improvement in model accuracy over the bureau-only models.&nbsp; However, the social media data on its own does have some limited predictive power.</p><p>&nbsp;</p>]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Extracting Meaningful Statistics for the Characterization and Classification of Biological, Medical, and Financial Data]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2015-05-01T11:00:00-04:00]]></value>
      <value2><![CDATA[2015-05-01T13:00:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>1366</tid>
        <value><![CDATA[defense]]></value>
      </item>
          <item>
        <tid>1808</tid>
        <value><![CDATA[graduate students]]></value>
      </item>
          <item>
        <tid>121301</tid>
        <value><![CDATA[graduate students. defense. PhD.]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
