Massively parallel sequencing has transformed the field of genomic studies. These new technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They also hold the promise to help explain some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for application to the study of complex human traits. The methodological work lags in at least three major areas. First, there are few, if any, publications on the optimal design of sequencing-based studies for complex traits that take into account the complex dynamics of sequencing cost and allow exploration of the full range of sample sizes and sequencing depths. Second, there are no published methods for the analysis of low-coverage (in the range of 2-4X) sequencing data. Low-coverage sequencing is being used to study complex diseases and traits because it can lead to substantial gains in power by increasing the effective sample size, which is critical for detecting the moderate genetic effects typical of complex human traits. Third, the field needs statistical methods that can efficiently analyze rare variants derived from various designs of sequencing-based studies. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies for complex human traits. To do so, we propose the following four specific aims: 1) Develop a unified statistical framework for SNP calling, genotyping, and haplotyping from sequencing and genotyping data. 2) Provide alternative design options for sequencing-based genetic studies. 3) Develop statistical methods for the analysis of rare variants. 4) Develop, distribute and support freely available software packages for the methods proposed in this application. The proposed methods will be evaluated through analytical approaches, computer simulations and applications to multiple real datasets.
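
The trade-off between sample size and sequencing depth mentioned above can be illustrated with simple arithmetic. The sketch below is not the method proposed in this application; it assumes a hypothetical linear cost model (a fixed per-sample preparation cost plus a cost per 1X of depth), and the function names and all dollar figures are invented for illustration only.

```python
def affordable_samples(budget, prep_cost, cost_per_x, depth):
    """Number of samples that fit in a fixed budget at a given depth.

    Assumes a hypothetical linear cost model:
        cost per sample = prep_cost + cost_per_x * depth
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    per_sample_cost = prep_cost + cost_per_x * depth
    return int(budget // per_sample_cost)


def design_grid(budget, prep_cost, cost_per_x, depths):
    """Pair each candidate depth with the sample size the budget allows."""
    return [
        (depth, affordable_samples(budget, prep_cost, cost_per_x, depth))
        for depth in depths
    ]


if __name__ == "__main__":
    # Illustrative numbers only; real sequencing costs differ and change over time.
    for depth, n in design_grid(1_000_000, 100, 50, [2, 4, 8, 16, 30]):
        print(f"depth {depth:>2}X -> {n} samples")
```

Under these assumed costs, lowering the depth raises the number of samples that can be sequenced, which is the sense in which low-coverage designs can increase sample size for a fixed budget. The sketch does not model how genotype accuracy, and hence effective sample size, degrades as depth falls.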

Public Health Relevance

Massively parallel sequencing has transformed the field of genomic studies. These new technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders and hold the promise to help explain some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for application to the study of complex human traits. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies for complex human traits.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006292-04
Application #: 8666560
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
Project Start: 2011-08-23
Project End: 2016-05-31
Budget Start: 2014-06-01
Budget End: 2015-05-31
Support Year: 4
Fiscal Year: 2014
Total Cost: $359,607
Indirect Cost: $114,607

Name: University of North Carolina Chapel Hill
Department: Genetics
Type: Schools of Medicine
DUNS #: 608195277
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599

Wang, Yanli; Song, Fan; Zhang, Bo et al. (2018) The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol 19:151
Liu, C; Marioni, R E; Hedman, Å K et al. (2018) A DNA methylation biomarker of alcohol consumption. Mol Psychiatry 23:422-433
Duan, Qing; Xu, Zheng; Raffield, Laura M et al. (2018) A robust and powerful two-step testing procedure for local ancestry adjusted allelic association analysis in admixed populations. Genet Epidemiol 42:288-302
Luo, Yiwen; Maity, Arnab; Wu, Michael C et al. (2018) On the substructure controls in rare variant analysis: Principal components or variance components? Genet Epidemiol 42:276-287
Martin, Joshua S; Xu, Zheng; Reiner, Alex P et al. (2017) HUGIn: Hi-C Unifying Genomic Interrogator. Bioinformatics 33:3793-3795
Kerr, Kathleen F; Avery, Christy L; Lin, Henry J et al. (2017) Genome-wide association study of heart rate and its variability in Hispanic/Latino cohorts. Heart Rhythm 14:1675-1684
Du, Yonghong; Martin, Joshua S; McGee, John et al. (2017) A SNP panel and online tool for checking genotype concordance through comparing QR codes. PLoS One 12:e0182438
Cannon, Maren E; Duan, Qing; Wu, Ying et al. (2017) Trans-ancestry Fine Mapping and Molecular Assays Identify Regulatory Variants at the ANGPTL8 HDL-C GWAS Locus. G3 (Bethesda) 7:3217-3227
Graff, Mariaelisa; Emery, Leslie S; Justice, Anne E et al. (2017) Genetic architecture of lipid traits in the Hispanic community health study/study of Latinos. Lipids Health Dis 16:200
Raffield, Laura M; Zakai, Neil A; Duan, Qing et al. (2017) D-Dimer in African Americans: Whole Genome Sequence Analysis and Relationship to Cardiovascular Disease Risk in the Jackson Heart Study. Arterioscler Thromb Vasc Biol 37:2220-2227

Showing the most recent 10 out of 52 publications