Project Title: "Predicting a genomic 'grammar' using new computational methods"

This project is awarded under the Postdoctoral Research Fellowships in Biological Informatics Program for 2006. Transcriptional regulation of gene expression is of major importance to most biological processes. The expression patterns of tissue- or cell-type-specific genes are typically complex, and in many cases different aspects of a pattern are controlled by separate cis-regulatory modules (CRMs), also known as transcriptional enhancers. The aim of this project is to develop and apply computational methods for predicting the classes of CRMs, and their grammar, that drive the expression patterns of tissue- or cell-type-specific genes. The research focuses on the mechanisms of regulation in two systems: gap/segmentation genes and genes expressed in the embryonic mesodermal founder cells (FCs) of Drosophila melanogaster. The Fellow plans to improve existing algorithms for finding candidate CRMs and to introduce alternative algorithms, based on recently developed computational techniques, to further this search. Specifically, the Fellow will apply unsupervised and statistical learning methods to determine statistically significant clusters of candidate CRMs. Using statistical learning and Bayesian approaches, these clusters will then be searched for patterns that define statistically significant classes of CRMs. Novel approaches from transductive statistical learning will also be applied. From these results, transcriptional models comprising different combinations of cell-type-specific transcription factors (TFs) and their cis-regulatory sequences will be deduced. The results from these experiments will provide more data on real enhancers and their grammar, on which the algorithms can then be retrained for improved performance.

The Fellow will conduct this research in the laboratory of Dr. Martha Bulyk at Brigham and Women's Hospital, Harvard. The broader impacts of the project are both scientific and educational. The proposed methods can be applied to any biological system and can help to better characterize many transcriptional regulatory networks. The Fellow will mentor both undergraduate and graduate students in statistical and machine-learning methods and engage them in the software design. The software will be open-source and made available to educators on a public website.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0630755
Program Officer: Carter Kimsey
Project Start:
Project End:
Budget Start: 2006-10-01
Budget End: 2008-09-30
Support Year:
Fiscal Year: 2006
Total Cost: $120,000
Indirect Cost:
Name: Jaeger, Savina A
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02138