This project will develop computer programs that exploit Human Microbiome Project (HMP) DNA sequences to better understand DNA-protein interactions. The interactions between transcription factors and the DNA sites they bind are critical to controlling gene expression within each species, and therefore also shape the characteristics of each species and its interactions with the human host. The transcription factors themselves can be readily identified from DNA sequences, and we will take advantage of the fact that most bacterial transcription factors regulate themselves and/or adjacent genes within their chromosomes. Transcription factors can be clustered into groups that are expected to recognize the same DNA patterns, based on known structures of similar proteins from well-studied bacteria. Together, the clusters of proteins with very similar specificity and the probable regulatory regions of nearby promoters will give us a very large number of potential DNA-protein interaction sites to which pattern discovery algorithms can be applied. This should not only help us learn about the regulatory networks within the HMP species, but also lead to a more general understanding of the relationships between transcription factor proteins and the DNA patterns that they recognize. This work will have broader implications across several areas of biological research and may lead to the design of new proteins with novel specificities that could be useful as research tools and as therapeutics.
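
As a rough illustration of the pattern discovery step described above, the sketch below counts how many promoter regions in one transcription factor cluster contain each short DNA word (k-mer) on either strand, and ranks the words by that support. It is a deliberately simple stand-in, not the project's actual algorithm. The function names, the toy sequences, and the choice of k = 7 are all assumptions made for this example.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_support(promoters, k):
    """For each k-mer, count how many promoters contain it on either strand.

    A k-mer and its reverse complement are collapsed into one canonical key
    (the lexicographically smaller of the two), because a binding site can
    lie on either strand of the promoter.
    """
    support = Counter()
    for seq in promoters:
        seq = seq.upper()
        seen = set()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                seen.add(min(kmer, reverse_complement(kmer)))
        # Each promoter contributes at most once per k-mer.
        support.update(seen)
    return support


def top_candidates(promoters, k, n=5):
    """Return the n k-mers present in the largest number of promoters."""
    return kmer_support(promoters, k).most_common(n)


# Hypothetical upstream regions for one cluster of similar transcription
# factors. The third sequence carries the shared site on the opposite strand.
toy_promoters = [
    "ACGTTGTGAGCGGATAA",
    "CCATTGTGAGCTTTACG",
    "GGCTCACATTGCATCAG",
]

candidates = top_candidates(toy_promoters, k=7, n=1)
```

On real data, a count like this would only be a first filter. Candidate words would still need to be compared against a background model and refined into a position weight matrix before being treated as a binding specificity.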

Public Health Relevance

The Human Microbiome Project will obtain DNA sequences from many different species inhabiting many different microenvironments of the human body. This project will develop computer programs to analyze those DNA sequences to help discover how the expression of the genes in those species is regulated. The regulation of gene expression is a key element in understanding the interactions between the microbial communities and the human host.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21HG005970-02
Application #
8149991
Study Section
Special Emphasis Panel (ZRG1-GGG-N (50))
Program Officer
Proctor, Lita
Project Start
2010-09-27
Project End
2014-06-30
Budget Start
2011-07-01
Budget End
2014-06-30
Support Year
2
Fiscal Year
2011
Total Cost
$188,100
Indirect Cost
Name
Washington University
Department
Genetics
Type
Schools of Medicine
DUNS #
068552207
City
Saint Louis
State
MO
Country
United States
Zip Code
63130
Stormo, Gary D; Zuo, Zheng; Chang, Yiming Kenny (2015) Spec-seq: determining protein-DNA-binding specificity by sequencing. Brief Funct Genomics 14:30-8
Zuo, Zheng; Stormo, Gary D (2014) High-resolution specificity from DNA sequencing highlights alternative modes of Lac repressor binding. Genetics 198:1329-43