The NIH Roadmap Epigenomics and ENCODE projects have generated a collection of 3000+ epigenomics datasets, including histone modification, DNA methylation, gene expression, and DNaseI hypersensitivity profiled across 190 cell and tissue types. To maximize the impact of this collection on our understanding of gene regulation, cellular differentiation, and human health, novel computational analyses are needed. To address this challenge, we will develop new methods for epigenomic analysis, building on our extensive experience interpreting epigenomic information and on our preliminary studies building chromatin states, activity clusters, and regulatory motif maps for the Roadmap Epigenomics and ENCODE datasets.
In Aim 1, we will characterize epigenomic differences and changes during lineage differentiation. We will develop new tools for systematic comparison of groups of epigenomes that directly exploit the complexity of epigenomic datasets; methods for clustering epigenomes into developmental lineages based on automatically learned, diverse epigenomic features that distinguish them; and methods that learn the unidirectional epigenomic changes that pluripotent cells undergo during lineage commitment, to gain more insight into differentiation and to automatically classify lineages and differentiation trajectories.
In Aim 2, we will characterize higher-order chromatin architecture and chromatin conformation to enable systematic interpretation of cis-regulatory modules. We will develop a novel statistical approach for enhancer-enhancer and enhancer-gene linking that reveals interacting regions and their target genes based on their coordinated activity patterns across cell and tissue types; we will train a supervised learning method for predicting both constitutive and tissue-specific chromatin conformation from chromatin state information, individual chromatin marks, genomic distance, activity, regulatory motif information, and DNA sequence; and we will use these higher-order interaction maps to predict gene expression levels based on the combined action of multiple regulatory regions and to define the cis-regulatory architecture of each gene in the human genome.
The resulting resources will be invaluable for studies of gene regulation, by revealing the set of regulatory elements that are linked to each gene, and for the interpretation of genetic studies, by revealing the set of regulatory elements that jointly act to regulate each target gene and the potential target genes of non-coding variants associated with human disease.
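The enhancer-gene linking idea in Aim 2 (pairing regions with target genes by their coordinated activity across cell and tissue types) can be illustrated with a minimal sketch. The abstract does not specify the statistical approach, so everything below is an assumption made for illustration: Pearson correlation as the measure of coordination, a fixed genomic distance window, a hard correlation cutoff, and the hypothetical names `pearson`, `link_enhancers_to_genes`, `max_distance`, and `min_corr`. The data are toy values, not results from the project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length activity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / sqrt(vx * vy)


def link_enhancers_to_genes(enhancers, genes, max_distance=500_000, min_corr=0.8):
    """Return (enhancer, gene, correlation) for nearby, co-active pairs.

    enhancers, genes: dict mapping name -> (position, [activity per cell type]).
    max_distance and min_corr are illustrative thresholds (assumptions).
    """
    links = []
    for e_name, (e_pos, e_act) in enhancers.items():
        for g_name, (g_pos, g_expr) in genes.items():
            if abs(e_pos - g_pos) > max_distance:
                continue  # skip pairs outside the distance window
            r = pearson(e_act, g_expr)
            if r >= min_corr:
                links.append((e_name, g_name, r))
    # strongest links first
    return sorted(links, key=lambda t: -t[2])


# Toy data: activity across four hypothetical cell types.
enhancers = {
    "enhA": (100_000, [0.9, 0.1, 0.8, 0.0]),
    "enhB": (150_000, [0.1, 0.9, 0.2, 0.7]),
}
genes = {
    "gene1": (120_000, [5.0, 1.0, 4.5, 0.5]),
    "gene2": (900_000, [5.0, 1.0, 4.5, 0.5]),
}
links = link_enhancers_to_genes(enhancers, genes)
for enhancer, gene, r in links:
    print(enhancer, gene, round(r, 3))
```

In this toy example only `enhA` and `gene1` are linked: `gene2` has the same expression profile but lies outside the distance window, and `enhB` is anti-correlated with `gene1`. A real analysis would need a proper significance model and multiple-testing control rather than a fixed cutoff.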

Public Health Relevance

The NIH ENCODE and Roadmap Epigenomics projects have generated a collection of 4000+ epigenomics datasets across 200+ cell and tissue types. This collection can be invaluable to the scientific community, but novel computational analyses are needed to maximize its impact on our understanding of gene regulation, cellular differentiation, and human health. To address this need, we propose to systematically study epigenomic differences between cell types, lineages, and stages of differentiation, and to learn models predicting the higher-order chromatin structures across cell/tissue types, within each cell/tissue, and the chromatin architectures that they define. By systematically interpreting epigenomic annotations in the context of their cellular differentiation and their higher-order chromatin structures, the resulting resource will greatly increase the impact of these annotations on human health by enabling disease studies to understand the mechanistic relationship between non-coding variants and their target genes, and the specific differentiation stages in which they act.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 1R01GM113708-01
Application #: 8847548
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Ravichandran, Veerasamy
Project Start: 2015-04-01
Project End: 2018-03-31
Budget Start: 2015-04-01
Budget End: 2016-03-31
Support Year: 1
Fiscal Year: 2015
Total Cost:
Indirect Cost:

Name: Massachusetts Institute of Technology
Department:
Type: Organized Research Units
DUNS #: 001425594
City: Cambridge
State: MA
Country: United States
Zip Code:
Onuchic, Vitor; Lurie, Eugene; Carrero, Ivenise et al. (2018) Allele-specific epigenome maps reveal sequence-dependent stochastic switching at regulatory loci. Science 361:
Wang, Yang; Li, Yue; Yue, Minghui et al. (2018) N6-methyladenosine RNA modification regulates embryonic neural stem cell self-renewal through histone modifications. Nat Neurosci 21:195-206
Miyamoto, Kei; Nguyen, Khoi T; Allen, George E et al. (2018) Chromatin Accessibility Impacts Transcriptional Reprogramming in Oocytes. Cell Rep 24:304-311
Marco, Eugenio; Meuleman, Wouter; Huang, Jialiang et al. (2017) Multi-scale chromatin state annotation using a hierarchical hidden Markov model. Nat Commun 8:15011
Liu, Yaping; Sarkar, Abhishek; Kheradpour, Pouya et al. (2017) Evidence of reduced recombination rate in human regulatory domains. Genome Biol 18:193
Kreimer, Anat; Zeng, Haoyang; Edwards, Matthew D et al. (2017) Predicting gene expression in massively parallel reporter assays: A comparative study. Hum Mutat 38:1240-1250
Le Gros, Mark A; Clowney, E Josephine; Magklara, Angeliki et al. (2016) Soft X-Ray Tomography Reveals Gradual Chromatin Compaction and Reorganization during Neurogenesis In Vivo. Cell Rep 17:2125-2136
Ernst, Jason; Melnikov, Alexandre; Zhang, Xiaolan et al. (2016) Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat Biotechnol 34:1180-1190
Wang, Xinchen; Tucker, Nathan R; Rizki, Gizem et al. (2016) Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures. Elife 5:
Marbach, Daniel; Lamparter, David; Quon, Gerald et al. (2016) Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases. Nat Methods 13:366-70

Showing the most recent 10 out of 16 publications