I propose new knowledge-based approaches for determining both the short sequences of DNA to which transcription factors bind and the ways in which the context of these sequences specifies regulatory control. I will apply these approaches to recently produced binding data from chromatin-immunoprecipitation microarray efforts in S. cerevisiae. First, I will incorporate information from three-dimensional structures of protein-DNA complexes into motif discovery algorithms to identify over-represented sequences in the bound intergenic regions. Second, I will use machine learning techniques to apply multiple criteria based on biological knowledge of promoter organization to identify which of the binding sites predicted by these algorithms are genuine. As a first application of new motifs found using these approaches, I will perform hypothesis-directed sequence analysis to reconcile differences between in vivo binding and the organization of binding sites in promoters. Following experimental validation, the discovered binding motifs and organizational rules will provide the basis for a deeper understanding of the underlying logic of promoter organization, and will have a direct impact on the study of transcriptional regulatory mechanisms in higher eukaryotes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Postdoctoral Individual National Research Service Award (F32)
Project #
5F32GM068278-02
Application #
6841605
Study Section
Special Emphasis Panel (ZRG1-F05 (20))
Program Officer
Portnoy, Matthew
Project Start
2004-01-01
Project End
2005-02-20
Budget Start
2005-01-01
Budget End
2005-02-20
Support Year
2
Fiscal Year
2005
Total Cost
$7,362
Indirect Cost
Name
Whitehead Institute for Biomedical Research
Department
Type
DUNS #
120989983
City
Cambridge
State
MA
Country
United States
Zip Code
02142
Gordon, D Benjamin; Nekludova, Lena; McCallum, Scott et al. (2005) TAMO: a flexible, object-oriented framework for analyzing transcriptional regulation using DNA-sequence motifs. Bioinformatics 21:3164-5