The upcoming NHGRI Centers for Common Disease Genomics (CCDG) and Centers for Mendelian Genomics (CMG) plan to generate whole genome sequencing (WGS) data on over 200,000 individuals. WGS will provide comprehensive genetic data across coding and non-coding variation, presenting an unprecedented opportunity for discovery in the genetic analysis of human diseases. However, the lack of powerful analytic tools that fully realize the potential of these data has emerged as a bottleneck in translating the rich information contained in massive WGS data into meaningful insights about human diseases. There is a pressing need to develop powerful and robust analytic methods for WGS that can accelerate genetic discoveries. To meet this need, we have assembled an interdisciplinary team of computational biologists, geneticists, and statisticians. Building on our extensive track record in sequencing studies, statistical genetics, functional analysis, and computational biology, we will power the next round of genetic discoveries by (1) building a massive WGS control sample and developing methods for incorporating these controls in studies of complex and Mendelian diseases; (2) creating more powerful statistical methods for rare variant analysis through the incorporation of functional and regulatory information and advanced statistical tools; and (3) establishing methods to analyze multiple phenotypes, both to boost the power of association tests and to understand how different phenotypes relate genetically. These methods will enhance our ability to identify novel associations across a wide range of genetic architectures, from Mendelian diseases driven by a strong-acting allele to complex polygenic traits. Novel associations promise to lay the foundation for new insight into the biological mechanisms driving disease and to serve as the bedrock for precision prevention and precision medicine strategies.
We will collaborate with the investigators of the Genome Sequencing Program and will share the data resources, tools, and methods we develop with the community through user-friendly open-source software and educational modules.

Public Health Relevance

Statistical and computational methods, together with shared data and functional annotation resources, play a pivotal role in the genetic analysis of human diseases using whole genome sequencing (WGS) data. They will enable researchers to extract knowledge from massive WGS data and from complex, diverse phenotype data in a timely and effective manner, to gain insight into disease etiology, risk, and prognosis, and to lay the foundation for new strategies that reduce disease burden and improve disease prevention and patient care.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 3U01HG009088-04S3
Application #: 10168752
Study Section:
Program Officer: Felsenfeld, Adam
Project Start: 2020-07-31
Project End: 2021-03-31
Budget Start: 2020-07-31
Budget End: 2021-03-31
Support Year: 4
Fiscal Year: 2020
Total Cost:
Indirect Cost:
Name: Harvard University
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 149617367
City: Boston
State: MA
Country: United States
Zip Code: 02115
Gazal, Steven; Loh, Po-Ru; Finucane, Hilary K et al. (2018) Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations. Nat Genet 50:1600-1607
Slowikowski, Kamil; Wei, Kevin; Brenner, Michael B et al. (2018) Functional genomics of stromal cells in chronic inflammatory diseases. Curr Opin Rheumatol 30:65-71
Liu, Zhonghua; Lin, Xihong (2018) Multiple phenotype association tests using summary statistics in genome-wide association studies. Biometrics 74:165-175
Verbanck, Marie; Chen, Chia-Yen; Neale, Benjamin et al. (2018) Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet 50:693-698
Sun, Ryan; Carroll, Raymond J; Christiani, David C et al. (2018) Testing for gene-environment interaction under exposure misspecification. Biometrics 74:653-662
Li, Heng; Bloom, Jonathan M; Farjoun, Yossi et al. (2018) A synthetic-diploid benchmark for accurate variant-calling evaluation. Nat Methods 15:595-597
He, Liang; Zhbannikov, Ilya; Arbeev, Konstantin G et al. (2017) A genetic stochastic process model for genome-wide joint analysis of biomarker dynamics and disease susceptibility with longitudinal data. Genet Epidemiol 41:620-635
Aschard, Hugues; Guillemot, Vincent; Vilhjalmsson, Bjarni et al. (2017) Covariate selection for association screening in multiphenotype genetic studies. Nat Genet 49:1789-1795
Sohail, Mashaal; Vakhrusheva, Olga A; Sul, Jae Hoon et al. (2017) Negative selection in humans and fruit flies involves synergistic epistasis. Science 356:539-542
McAllister, Kimberly; Mechanic, Leah E; Amos, Christopher et al. (2017) Current Challenges and New Opportunities for Gene-Environment Interaction Studies of Complex Diseases. Am J Epidemiol 186:753-761

Showing the most recent 10 out of 21 publications