Modeling the Dynamics of Genome-Scale Data Across Trees

Kostka, Dennis; Capra, John

Abstract

The human genome encodes the developmental programs that result in the creation and maintenance of a complex organism with hundreds of tissues and trillions of cells. Precise, cell-type-specific control of gene expression is crucial to these processes. Transcription factor (TF) binding, DNA methylation, and histone modifications at gene regulatory elements play key roles in regulating RNA expression. However, the interplay between these factors is poorly understood for most human tissues, and disruption of these processes can cause birth defects, cancer, and other diseases. Recent advances in experimental technology have resulted in the production of thousands of genome-wide profiles of DNA methylation, histone modifications, and TF binding across hundreds of cellular contexts. These data hold the promise of revealing the dynamic genomic changes that drive proper development, but sound statistical and computational methods for integrating and testing hypotheses about these large, complex, and highly interdependent data are needed. Different cellular contexts are related through their differentiation histories, and the goal of this project is to develop analysis tools that leverage these dependencies between developmentally related cell types. This will facilitate the identification of significant changes in DNA and chromatin modifications within developing lineages, and it will highlight when and how these modifications impact gene expression. The approaches developed in this project will enable researchers to address the following biomedically important questions: Which DNA and chromatin modifications drive different transitions in cellular differentiation? Which genomic regions are influenced by these modifications? Which genes are influenced by these dynamic regulatory modifications in different lineages? Software will be developed, tested, and validated on several recent detailed characterizations of blood cell differentiation.
This work will provide the developmental and cancer biology communities with open-source tools for characterizing the genomic basis of normal and abnormal development. In addition, with erroneous patterns of DNA methylation and histone modification now being used as diagnostic hallmarks for specific cancers, given the right data, this framework may open up avenues towards a better understanding of the biological underpinnings of such biomarkers.

Public Health Relevance

Different regions of the human genome are active in different types of cells, and improper activation of specific regions is often the cause of developmental disorders, cancer, and other diseases. Recent dramatic improvements in experimental techniques have led to the collection of thousands of genome-wide activity patterns, but statistical methods to coherently model and test hypotheses about these data still need to be developed. The research proposed in this project will model genome-wide activity profiles in a statistical framework that accounts for interactions and dependencies in profiles from related cell types. The implementation of these models in open-source software will enable researchers to characterize shifts in genomic activity that are associated with the creation of healthy cells, and to identify how genomic regulation goes awry in disease.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM115836-03
Application #: 9306885
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Ravichandran, Veerasamy

Project Start: 2015-08-01
Project End: 2019-06-30
Budget Start: 2017-07-01
Budget End: 2019-06-30
Support Year: 3
Fiscal Year: 2017

Institution

Name: University of Pittsburgh
Department: Anatomy/Cell Biology
Type: Schools of Medicine
DUNS #: 004514360

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Related projects


NIH 2017 R01 GM	Modeling the Dynamics of Genome-Scale Data Across Trees Kostka, Dennis; Capra, John Anthony / University of Pittsburgh
NIH 2016 R01 GM	Modeling the Dynamics of Genome-Scale Data Across Trees Kostka, Dennis; Capra, John Anthony / University of Pittsburgh
NIH 2015 R01 GM	Modeling the Dynamics of Genome-Scale Data Across Trees Kostka, Dennis; Capra, John Anthony / University of Pittsburgh	$361,934

Publications

Pouyan, Maziyar Baran; Kostka, Dennis (2018) Random forest based similarity learning for single cell RNA sequencing data. Bioinformatics 34:i79-i88

Kostka, Dennis; Holloway, Alisha K; Pollard, Katherine S (2018) Developmental Loci Harbor Clusters of Accelerated Regions That Evolved Independently in Ape Lineages. Mol Biol Evol 35:2034-2045

Phua, Yu Leng; Clugston, Andrew; Chen, Kevin Hong et al. (2018) Small non-coding RNA expression in mouse nephrogenic mesenchymal progenitors. Sci Data 5:180218

Simonti, Corinne N; Pavlicev, Mihaela; Capra, John A (2017) Transposable Element Exaptation into Regulatory Regions Is Rare, Influenced by Evolutionary Age, and Subject to Pleiotropic Constraints. Mol Biol Evol 34:2856-2869

Colbran, Laura L; Chen, Ling; Capra, John A (2017) Short DNA sequence patterns accurately identify broadly active human enhancers. BMC Genomics 18:536
