To realize the promise of the human genome project, we need not only the parts list of all the genes, but also a comprehensive understanding of how they function together. Along with genes, our genome contains all the signals necessary for controlling gene expression in response to environmental and developmental stimuli. These regulatory processes are governed by short sequence motifs, which modulate gene usage at every level. Despite their prevalence, regulatory motifs have been particularly challenging to identify, owing to their short length and the varying distances at which they can act. Given their extraordinary importance, understanding them systematically remains one of the major challenges of modern biology. In the proposed work, we use comparative genomics of multiple mammals to systematically identify and characterize regulatory motifs in the human genome based on their evolutionary conservation. We have pioneered a powerful new approach for de novo motif discovery using genome-wide conservation, and successfully applied it in four yeast genomes, twelve fly genomes, and human promoters and 3'-UTRs. Here we expand this methodology to undertake motif discovery across the entire human genome: (1) we develop methods that use dozens of mammalian species for motif discovery and characterization; (2) we identify significant motif combinations and grammars and reveal their functional roles; and (3) we discover functional regions of motif clustering and study the role of motifs in specifying enhancer function. The proposed work is timely, given that NHGRI's sequencing efforts now encompass more than 30 mammalian genomes, chosen specifically for understanding the human genome. Moreover, large-scale systematic experimentation is providing the functional information necessary to inform and validate our findings.
By revealing the underlying sequence patterns that govern gene usage, we complement these ongoing efforts and provide access to the concrete building blocks of human gene regulation. This will enable researchers world-wide to link new genes in pathways by their co-regulation, elucidate the role of non-coding SNPs in regulatory diseases, and lead to new tests and therapeutics for modern medicine. A global map of regulatory motifs constitutes a necessary knowledge infrastructure towards a comprehensive understanding of regulation, development, and disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG004037-07
Application #: 8541874
Study Section: Special Emphasis Panel (ZRG1-GGG-D (90))
Program Officer: Pazin, Michael J
Project Start: 2007-09-28
Project End: 2015-10-31
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year: 7
Fiscal Year: 2013
Total Cost: $395,908
Indirect Cost: $159,545

Name: Massachusetts Institute of Technology
Department:
Type: Organized Research Units
DUNS #: 001425594
City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Marbach, Daniel; Lamparter, David; Quon, Gerald et al. (2016) Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat Methods 13:366-70
Ma, Jiao; Diedrich, Jolene K; Jungreis, Irwin et al. (2016) Improved Identification and Analysis of Small Open Reading Frame Encoded Polypeptides. Anal Chem 88:3967-75
Ward, Lucas D; Kellis, Manolis (2016) HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 44:D877-81
Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399
Jungreis, Irwin; Chan, Clara S; Waterhouse, Robert M et al. (2016) Evolutionary Dynamics of Abundant Stop Codon Readthrough. Mol Biol Evol 33:3108-3132
Ernst, Jason; Melnikov, Alexandre; Zhang, Xiaolan et al. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34:1180-1190
Ernst, Jason; Kellis, Manolis (2015) Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat Biotechnol 33:364-76
Gjoneska, Elizabeta; Pfenning, Andreas R; Mathys, Hansruedi et al. (2015) Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer's disease. Nature 518:365-9
Chibnik, Lori B; Yu, Lei; Eaton, Matthew L et al. (2015) Alzheimer's loci: epigenetic associations and interaction with genetic factors. Ann Clin Transl Neurol 2:636-47
Pierson, Emma; GTEx Consortium; Koller, Daphne et al. (2015) Sharing and Specificity of Co-expression Networks across 35 Human Tissues. PLoS Comput Biol 11:e1004220

Showing the most recent 10 out of 100 publications