This application addresses broad Challenge Area (08) Genomics and specific Challenge Topic 08-OD-101: Computational approaches for epigenomic analysis. While the primary DNA sequence of the human genome is ultimately responsible for the encoding and functioning of each cell, a plethora of chromatin and DNA modifications have been described in recent years that can modulate the interpretation of this primary sequence. These epigenetic modifications lead to the diversity of function across different human cell types, and play key roles in the establishment and maintenance of cellular identity during development, as well as in health and disease. The human ENCODE project, the NIH Epigenome Roadmap, and several other large-scale experimental efforts are currently underway to map dozens of histone and DNA modifications across multiple human cell types and disease states, generating a diversity of rich epigenomic datasets. This creates a pressing need for rigorous computational methods for the systematic integrative analysis of epigenomic datasets, and for understanding their relationship to other genomic datasets, including gene expression, disease association, and phenotypic profiling. In this proposal, we will develop and apply graphical probabilistic models for describing chromatin modifications, based on multivariate hidden Markov models. We will use these models to discover the set of underlying chromatin states, based on recurrent combinations of epigenetic marks across the entire genome (Aim 1). We will validate and functionally characterize these states based on their enrichments and positional biases with respect to existing functional elements, as well as large-scale gene expression and disease association datasets (Aim 2). Lastly, we will extend these methods to study the dynamics of chromatin states across both healthy and disease cell types, and study how these dynamics correlate with functional differences between the observed cell types (Aim 3).
We will work closely with the scientists involved in data production to facilitate communication and data integration among them. We will also work with the data analysis and coordination centers already established to share methods and results across the ENCODE and Epigenome Roadmap consortia and with the larger community. Overall, the proposed integrative analysis of large-scale genomic and epigenomic datasets will provide a unified view of current and planned epigenomic datasets, toward a systematic understanding of gene and genome regulation in health and disease.
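To make the modeling idea in the abstract concrete, the following is a minimal sketch of how a multivariate hidden Markov model can assign a chromatin state to each genomic bin from a combination of marks. It assumes one common formulation: each mark is binarized as present or absent per bin, and each state emits marks independently with its own Bernoulli probabilities. The function names, the two states, the three marks, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the proposal. In practice the parameters would be learned from data (for example with expectation-maximization), which is omitted here; the sketch only decodes the most likely state path with the Viterbi algorithm.

```python
import math


def log_emission(obs, probs):
    """Log-probability of a binary mark vector under independent Bernoulli emissions."""
    return sum(math.log(p if o else 1.0 - p) for o, p in zip(obs, probs))


def viterbi(observations, start, trans, emit):
    """Return the most likely state path for a sequence of binary mark vectors."""
    n_states = len(start)
    # score[k]: best log-probability of any path that ends in state k at the current bin
    score = [math.log(start[k]) + log_emission(observations[0], emit[k])
             for k in range(n_states)]
    back = []  # back[t][k]: best predecessor of state k at bin t + 1
    for obs in observations[1:]:
        prev = score
        pointers, score = [], []
        for k in range(n_states):
            best = max(range(n_states),
                       key=lambda j: prev[j] + math.log(trans[j][k]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][k])
                         + log_emission(obs, emit[k]))
        back.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: 2 states, 3 binarized marks per genomic bin
START = [0.5, 0.5]
TRANS = [[0.9, 0.1],
         [0.1, 0.9]]
EMIT = [[0.9, 0.8, 0.1],   # state 0: marks 1 and 2 usually present
        [0.1, 0.1, 0.7]]   # state 1: mark 3 usually present
BINS = [(1, 1, 0), (1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 0, 1), (0, 0, 1)]

PATH = viterbi(BINS, START, TRANS, EMIT)
print(PATH)
```

With these toy parameters, the first three bins are decoded as state 0 and the last three as state 1, illustrating how recurrent mark combinations, rather than any single mark, define a state.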

Public Health Relevance

While the primary DNA sequence of the human genome is ultimately responsible for the encoding and functioning of each cell, a plethora of chromatin and DNA modifications have been described in recent years that can modulate the interpretation of this primary sequence, leading to the diversity of function across different human cell types. This project will create a computational framework and resource to integrate large-scale genomic and epigenomic datasets, to understand their functional role in health and disease, and to understand their dynamics across different cell lines and disease states. The knowledge gained can play key roles in understanding the establishment and maintenance of cellular identity during healthy development, and how dysregulation of these processes can lead to the onset of disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: NIH Challenge Grants and Partnerships Program (RC1)
Project #: 1RC1HG005334-01
Application #: 7817501
Study Section: Special Emphasis Panel (ZRG1-GGG-F (58))
Program Officer: Good, Peter J
Project Start: 2009-09-22
Project End: 2011-07-31
Budget Start: 2009-09-22
Budget End: 2010-07-31
Support Year: 1
Fiscal Year: 2009
Total Cost: $471,240
Indirect Cost:
Name: Massachusetts Institute of Technology
Department:
Type: Organized Research Units
DUNS #: 001425594
City: Cambridge
State: MA
Country: United States
Zip Code: 02139
Ernst, Jason; Kellis, Manolis (2017) Chromatin-state discovery and genome annotation with ChromHMM. Nat Protoc 12:2478-2492
Ward, Lucas D; Kellis, Manolis (2016) HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 44:D877-81
Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399
Gjoneska, Elizabeta; Pfenning, Andreas R; Mathys, Hansruedi et al. (2015) Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer's disease. Nature 518:365-9
Claussnitzer, Melina; Dankel, Simon N; Kim, Kyoung-Han et al. (2015) FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med 373:895-907
Ernst, Jason; Kellis, Manolis (2015) Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat Biotechnol 33:364-76
Roadmap Epigenomics Consortium; Kundaje, Anshul; Meuleman, Wouter et al. (2015) Integrative analysis of 111 reference human epigenomes. Nature 518:317-30
Yen, Angela; Kellis, Manolis (2015) Systematic chromatin state comparison of epigenomes associated with diverse properties including sex and tissue type. Nat Commun 6:7973
Madabhushi, Ram; Gao, Fan; Pfenning, Andreas R et al. (2015) Activity-Induced DNA Breaks Govern the Expression of Neuronal Early-Response Genes. Cell 161:1592-605
Ozel, A Bilge; Moroi, Sayoko E; Reed, David M et al. (2014) Genome-wide association study and meta-analysis of intraocular pressure. Hum Genet 133:41-57

Showing the most recent 10 out of 24 publications