Integrative analysis of genomic and epigenomic datasets in multiple cell types

Kellis, Manolis

Abstract

This application addresses broad Challenge Area (08) Genomics, and specific Challenge Topic 08-OD-101: Computational approaches for epigenomic analysis. While the primary DNA sequence of the human genome is ultimately responsible for the encoding and functioning of each cell, a plethora of chromatin and DNA modifications have been described in recent years that can modulate the interpretation of this primary sequence. These epigenetic modifications lead to the diversity of function across different human cell types, and play key roles in the establishment and maintenance of cellular identity during development, and also in health and disease. The human ENCODE project, the NIH Epigenome Roadmap, and several other large-scale experimental efforts are currently underway to map dozens of histone and DNA modifications across multiple human cell types and disease states, generating a diversity of rich epigenomic datasets. This creates a pressing need for the development of rigorous computational methods for the systematic integrative analysis of epigenomic datasets, and for understanding their relationship to other genomic datasets, including gene expression, disease association, and phenotypic profiling. In this proposal, we will develop and apply graphical probabilistic models for describing chromatin modifications, based on multivariate hidden Markov models. We will use these models to discover the set of underlying chromatin states, based on recurrent combinations of epigenetic marks across the entire genome (Aim 1). We will validate and functionally characterize these states based on their enrichments and positional biases with respect to existing functional elements, as well as large-scale gene expression and disease association datasets (Aim 2). Lastly, we will extend these methods to study the dynamics of chromatin state across both healthy and disease cell types, and study how these correlate with functional differences between the observed cell types (Aim 3).
We will work closely with the scientists involved in data production and facilitate communication and data integration among them, and with the data analysis and coordination centers already established to facilitate sharing of methods and results across the ENCODE and Epigenome Roadmap consortia and with the larger community. Overall, the proposed integrative analysis of large-scale genomic and epigenomic datasets will provide a unified view of current and planned epigenomic datasets, toward a systematic understanding of gene and genome regulation in health and disease.
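
As an illustration of the modeling idea in Aim 1, the sketch below decodes chromatin states from binary mark calls with a multivariate hidden Markov model. It is a minimal sketch under stated assumptions, not the project's implementation: it assumes each genomic bin is summarized as a present/absent vector of marks, that marks are emitted independently given the state (a product of Bernoulli distributions), and that model parameters are already known. The two-state, two-mark parameters and the function names are hypothetical, and parameter learning is not shown.

```python
import math


def log_emission(probs, obs):
    """Log-probability of a binary mark vector, assuming marks are
    independent Bernoulli variables given the state.
    Probabilities must lie strictly between 0 and 1."""
    return sum(math.log(p if o else 1.0 - p) for p, o in zip(probs, obs))


def viterbi(observations, start, trans, emit):
    """Most likely state path for a sequence of binary mark vectors.

    start[k]    : initial probability of state k
    trans[j][k] : probability of moving from state j to state k
    emit[k][m]  : probability that mark m is present in state k
    """
    n_states = len(start)
    # scores[k] = best log-probability of any path ending in state k
    scores = [math.log(start[k]) + log_emission(emit[k], observations[0])
              for k in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = [], []
        for k in range(n_states):
            # Best predecessor state for reaching state k at this bin
            best_prev = max(range(n_states),
                            key=lambda j: scores[j] + math.log(trans[j][k]))
            new_scores.append(scores[best_prev]
                              + math.log(trans[best_prev][k])
                              + log_emission(emit[k], obs))
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda k: scores[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: state 0 favors mark 0, state 1 favors mark 1.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.9]]

# Five consecutive genomic bins, each a (mark 0, mark 1) presence vector.
observations = [(1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
path = viterbi(observations, start, trans, emit)
print(path)  # [0, 0, 1, 1, 1]
```

In this toy example the decoded path switches state exactly where the combination of marks changes, which is the sense in which recurrent mark combinations define states. A real analysis would learn the parameters from data and use many more marks and states.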

Public Health Relevance

While the primary DNA sequence of the human genome is ultimately responsible for the encoding and functioning of each cell, a plethora of chromatin and DNA modifications have been described in recent years that can modulate the interpretation of this primary sequence, leading to the diversity of function across different human cell types. This project will create a computational framework and resource to integrate large-scale genomic and epigenomic datasets, to understand their functional role in health and disease, and to understand their dynamics across different cell lines and disease states. The knowledge gained can play key roles in understanding the establishment and maintenance of cellular identity during healthy development, and how dysregulation of these processes can lead to the onset of disease.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: NIH Challenge Grants and Partnerships Program (RC1)
Project #: 3RC1HG005334-02S1
Application #: 8337442
Study Section: Special Emphasis Panel (ZRG1-GGG-F (58))
Program Officer: Pazin, Michael J

Project Start: 2009-09-22
Project End: 2013-07-31
Budget Start: 2011-08-01
Budget End: 2013-07-31
Support Year: 2
Fiscal Year: 2011
Total Cost: $449,390
Indirect Cost

Institution

Name: Massachusetts Institute of Technology
Department
Type: Organized Research Units
DUNS #: 001425594

City: Cambridge
State: MA
Country: United States
Zip Code: 02139

Related projects


NIH 2011 RC1 HG	Integrative analysis of genomic and epigenomic datasets in multiple cell types Kellis, Manolis / Massachusetts Institute of Technology	$449,390
NIH 2010 RC1 HG	Integrative analysis of genomic and epigenomic datasets in multiple cell types Kellis, Manolis / Massachusetts Institute of Technology	$468,161
NIH 2009 RC1 HG	Integrative analysis of genomic and epigenomic datasets in multiple cell types Kellis, Manolis / Massachusetts Institute of Technology	$471,240

Publications

Ernst, Jason; Kellis, Manolis (2017) Chromatin-state discovery and genome annotation with ChromHMM. Nat Protoc 12:2478-2492

Bekelis, Kimon; Kerley-Hamilton, Joanna S; Teegarden, Amy et al. (2016) MicroRNA and gene expression changes in unruptured human cerebral aneurysms. J Neurosurg 125:1390-1399

Ward, Lucas D; Kellis, Manolis (2016) HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 44:D877-81

Gjoneska, Elizabeta; Pfenning, Andreas R; Mathys, Hansruedi et al. (2015) Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer's disease. Nature 518:365-9

Claussnitzer, Melina; Dankel, Simon N; Kim, Kyoung-Han et al. (2015) FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med 373:895-907

Ernst, Jason; Kellis, Manolis (2015) Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues. Nat Biotechnol 33:364-76

Roadmap Epigenomics Consortium; Kundaje, Anshul; Meuleman, Wouter et al. (2015) Integrative analysis of 111 reference human epigenomes. Nature 518:317-30

Yen, Angela; Kellis, Manolis (2015) Systematic chromatin state comparison of epigenomes associated with diverse properties including sex and tissue type. Nat Commun 6:7973

Madabhushi, Ram; Gao, Fan; Pfenning, Andreas R et al. (2015) Activity-Induced DNA Breaks Govern the Expression of Neuronal Early-Response Genes. Cell 161:1592-605

Ozel, A Bilge; Moroi, Sayoko E; Reed, David M et al. (2014) Genome-wide association study and meta-analysis of intraocular pressure. Hum Genet 133:41-57

Showing the most recent 10 out of 24 publications
