Since the completion of the Human Genome Project, several landmark consortia have accumulated large amounts of genomic data aimed at understanding the functions of the human genome. The ENCODE project has annotated genome-wide regulatory elements. The Roadmap Epigenomics project has characterized tissue-specific variation in epigenetic state. The NIH Common Fund GTEx project has delineated tissue-specific gene expression and transcriptional regulation. The NIH Common Fund 4D Nucleome (4DN) project has revealed dynamic 3D chromatin organization in many cell and tissue types. Each of these consortia has generated thousands or even tens of thousands of datasets and provided different insights into the human genome at an unprecedented scale and depth. However, the datasets from these consortia are isolated from one another in the cell and tissue types covered, in how the data are stored, and in the resolution of the genomic data. These gaps pose practical data analysis challenges for biomedical researchers who use these public datasets jointly: they must navigate different data portals with heterogeneous processing pipelines, different data formats, and unmatched resolutions.
We aim to develop cutting-edge deep learning approaches to impute high-resolution chromatin contact maps, and to integrate these maps with transcriptional data from the GTEx project and epigenomic data from ENCODE/Roadmap. We plan to share the integrated data on a public web server with a multi-panel interactive genome browser for visualization. The integrated data will provide an important resource for understanding tissue-specific genetic variation in light of the spatial organization of these genomic and epigenomic elements and their functional implications.

Public Health Relevance

The goal of this project is to develop novel computational methods to integrate 4DN datasets with GTEx and ENCODE/Roadmap datasets. The integrated datasets will be a critical resource for unveiling the mechanisms of the genetic variants identified in genome-wide association studies. The new knowledge gained here could help us understand the genetic basis of many human diseases.

Agency
National Institutes of Health (NIH)
Institute
Office of The Director, National Institutes of Health (OD)
Type
Small Research Grants (R03)
Project #
1R03OD030599-01
Application #
10109293
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Resat, Haluk
Project Start
2020-09-15
Project End
2021-08-31
Budget Start
2020-09-15
Budget End
2021-08-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Michigan Ann Arbor
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
073133571
City
Ann Arbor
State
MI
Country
United States
Zip Code
48109