The NHLBI Trans-Omics for Precision Medicine (TOPMed) program aims to provide high-priority studies of heart, lung, blood and sleep (HLBS) disorders with high-quality genomic data. This year, the program will deeply sequence >60,000 genomes to characterize DNA sequence variation at scale. It is expected that >400 million genetic variants will be identified. In later phases, it is expected that rich genomic assays will be applied to an equally large number of samples. In a pilot phase, these additional assays will include ~3,000 transcriptomes, ~2,000 methylation profiles, and ~2,000 metabolomics profiles. Data on this scale open up many opportunities for discovery and analysis but also pose significant challenges. RFA-HL-17-011, entitled "NHLBI TOPMed Program: Integrative Omics Approaches for Analysis of TOPMed Data (U01)," is intended to stimulate development of computational and statistical methods and tools that enable innovative and scalable analyses of this genomic resource. Our group has a long history in the development of specialized, state-of-the-art methods and tools for the processing and analysis of large genomic datasets. We have a history of leadership in varied resources, ranging from the Mouse HapMap Project to the 1000 Genomes Project and ENCODE, and including the NHLBI's TOPMed program. In this application, we propose to develop innovative and practical methods to enable informative genomic analysis at scale. These methods encompass computational tools to rapidly scale deep GWAS, statistical methods for robust and powerful integrative omics analysis, and visualization methods for integrative interpretation of omics and genetics results. We will implement these methods in cost-effective, easy-to-use, and well-documented software packages that facilitate understanding of the molecular mechanisms involved in HLBS disorders.
A key component of the proposal is the deployment of these tools on commercial clouds, providing an accessible interface to investigators without direct access to a local high-throughput compute and data storage facility. The resulting tools will empower a wide range of scientists to run best-in-class methods to accelerate discovery of new treatments for HLBS disorders.

Public Health Relevance

In the next few years, the NHLBI Trans-Omics for Precision Medicine (TOPMed) program will sequence >100,000 genomes and generate tens of thousands of omics profiles to unravel the etiology of heart, lung, blood and sleep disorders. Given the unprecedented amount of genomic data to be generated, it is crucial to develop a computational framework that reduces the computational burden on individual investigators analyzing the genetic basis of a trait. In this proposal, we not only propose developing novel methods and tools for integrative omics analysis, but also lay out specific plans to accomplish systematized collaboration via automated tools and an interactive interface. Through the scalable computational tools and the interactive graphical interface, we anticipate saving invaluable time for numerous TOPMed investigators, allowing them to focus on scientific interpretation instead of being distracted by the computational burden. As a result, we expect that these methods and tools will lead directly to improved understanding of the molecular basis of many human diseases, an important step on the path towards new treatments and therapies.

Agency
National Institutes of Health (NIH)
Institute
National Heart, Lung, and Blood Institute (NHLBI)
Type
Research Project--Cooperative Agreements (U01)
Project #
5U01HL137182-03
Application #
9675325
Study Section
Special Emphasis Panel (ZHL1)
Program Officer
Gan, Weiniu
Project Start
2017-04-15
Project End
2021-03-31
Budget Start
2019-04-01
Budget End
2021-03-31
Support Year
3
Fiscal Year
2019
Total Cost
Indirect Cost
Name
University of Michigan Ann Arbor
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
073133571
City
Ann Arbor
State
MI
Country
United States
Zip Code
48109
Kang, Hyun Min; Subramaniam, Meena; Targ, Sasha et al. (2018) Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat Biotechnol 36:89-94
Quang, Daniel; Guan, Yuanfang; Parker, Stephen C J (2018) YAMDA: thousandfold speedup of EM-based motif discovery using deep learning libraries and GPU. Bioinformatics 34:3578-3580
Regier, Allison A; Farjoun, Yossi; Larson, David E et al. (2018) Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat Commun 9:4038