Data Mining for Functional Genomics and Metagenomics -- TALK

Curtis Huttenhower

Assistant Professor of Computational Biology and Bioinformatics

Department of Biostatistics, HSPH

Data mining for functional genomics and metagenomics

Bioinformatics in the context of public health is needed at a wide range of
biological scales: molecular data describing cellular function, population
studies incorporating genomic data, and the systems biology tying together
these extremes.  At all of these levels, the volume of available data is
large; public repositories of genomic data currently contain billions of
experimental results from a variety of assays. While modern search engines
have made the size and heterogeneity of other complex systems such as the
Internet manageable, it remains an open question how machine learning can be
used to mine large genomic data collections for answers to specific
biological questions.

Curtis will discuss two algorithmic approaches to large-scale human genomic
data integration, both of which leverage tens of thousands of datasets to
predict interaction networks, disease linkages, and regulatory modules. He
will also present preliminary results applying this methodology to study
genetic and epigenetic variation in a ~1,000-subject colorectal cancer
cohort.  Finally, he will briefly discuss data integration in the context of
metagenomics, the study of uncultured microorganisms from environmental
samples.  This emerging, data-rich field presents a unique opportunity to
bring large-scale data integration to bear, particularly in the context of
human microflora and their impact on health within hosts and across
populations.

Leah Segal
Project Coordinator
Department of Biostatistics
Harvard School of Public Health
655 Huntington Avenue
Boston, MA  02115
(617) 432-7779 | Telephone
(617) 432-5619 | Fax
lsegal@hsph.harvard.edu