To realize the promise of the Human Genome Project, we need not only the parts list of all the genes, but also a comprehensive understanding of how they function together. Along with genes, our genome contains all the signals necessary for controlling gene expression in response to environmental and developmental stimuli. These regulatory processes are governed by short sequence motifs, which modulate gene usage at every level. Despite their prevalence, regulatory motifs have been particularly challenging to identify, due to their short length and the varying distances at which they can act. Given their extraordinary importance, understanding them systematically remains one of the major challenges of modern biology. In the proposed work, we use comparative genomics of multiple mammals to systematically identify and characterize regulatory motifs in the human genome based on their evolutionary conservation. We have pioneered a powerful new approach for de novo motif discovery using genome-wide conservation, and have successfully applied it to four yeast genomes, twelve fly genomes, and human promoters and 3'-UTRs. Here we expand this methodology to undertake motif discovery across the entire human genome: (1) we develop methods that use dozens of mammalian species for motif discovery and characterization; (2) we identify significant motif combinations and grammars and reveal their functional roles; and (3) we discover functional regions of motif clustering and study the role of motifs in specifying enhancer function. The proposed work is timely, given that NHGRI's sequencing efforts now encompass more than 30 mammalian genomes, selected specifically for understanding the human genome. Moreover, large-scale systematic experimentation is providing the functional information necessary to inform and validate our findings.
By revealing the underlying sequence patterns that govern gene usage, we complement these ongoing efforts and provide access to the concrete building blocks of human gene regulation. This will enable researchers worldwide to link new genes in pathways by their co-regulation and to elucidate the role of non-coding SNPs in regulatory diseases, and will lead to new tests and therapeutics for modern medicine. A global map of regulatory motifs constitutes a necessary knowledge infrastructure towards a comprehensive understanding of regulation, development, and disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-GGG-D (90))
Program Officer: Good, Peter J
Massachusetts Institute of Technology
Organized Research Units
United States
Loughran, Gary; Jungreis, Irwin; Tzani, Ioanna et al. (2018) Stop codon readthrough generates a C-terminally extended variant of the human vitamin D receptor with reduced calcitriol response. J Biol Chem 293:4434-4444
Ernst, Jason; Melnikov, Alexandre; Zhang, Xiaolan et al. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34:1180-1190
Jungreis, Irwin; Chan, Clara S; Waterhouse, Robert M et al. (2016) Evolutionary Dynamics of Abundant Stop Codon Readthrough. Mol Biol Evol 33:3108-3132
Wang, Xinchen; Tucker, Nathan R; Rizki, Gizem et al. (2016) Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures. Elife 5:
Marbach, Daniel; Lamparter, David; Quon, Gerald et al. (2016) Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat Methods 13:366-70
Ward, Lucas D; Kellis, Manolis (2016) HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 44:D877-81
Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399
Ma, Jiao; Diedrich, Jolene K; Jungreis, Irwin et al. (2016) Improved Identification and Analysis of Small Open Reading Frame Encoded Polypeptides. Anal Chem 88:3967-75
Elliott, GiNell; Hong, Chibo; Xing, Xiaoyun et al. (2015) Intermediate DNA methylation is a conserved signature of genome regulation. Nat Commun 6:6363
Gjoneska, Elizabeta; Pfenning, Andreas R; Mathys, Hansruedi et al. (2015) Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer's disease. Nature 518:365-9

Showing the most recent 10 out of 101 publications