Massively parallel sequencing has transformed genomic studies. These technologies have led to the successful identification of causal variants for several rare Mendelian disorders, and they hold promise for explaining some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for the study of complex human traits. The methodological work lags in at least three major areas. First, there are few, if any, publications on the optimal design of sequencing-based studies for complex traits that account for the complex dynamics of sequencing cost and allow exploration of the full range of sample sizes and sequencing depths. Second, there are no published methods for the analysis of low-coverage (2-4X) sequencing data. Low-coverage sequencing is being used to study complex diseases and traits because it can yield substantial gains in power by increasing the effective sample size, which is critical for detecting the moderate genetic effects typical of complex human traits. Third, the field needs statistical methods that can efficiently analyze rare variants derived from the various designs of sequencing-based studies. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies of complex human traits. To do so, we propose the following four specific aims: 1) Develop a unified statistical framework for SNP calling, genotyping, and haplotyping from sequencing and genotyping data. 2) Provide alternative design options for sequencing-based genetic studies. 3) Develop statistical methods for the analysis of rare variants. 4) Develop, distribute, and support freely available software packages for the methods proposed in this application. The proposed methods will be evaluated through analytical approaches, computer simulations, and applications to multiple real datasets.

Public Health Relevance

Massively parallel sequencing has transformed genomic studies. These technologies have led to the successful identification of causal variants for several rare Mendelian disorders and hold promise for explaining some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for the study of complex human traits. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies of complex human traits.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG006292-04
Application #: 8666560
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Brooks, Lisa
Project Start: 2011-08-23
Project End: 2016-05-31
Budget Start: 2014-06-01
Budget End: 2015-05-31
Support Year: 4
Fiscal Year: 2014
Total Cost: $359,607
Indirect Cost: $114,607

Name: University of North Carolina Chapel Hill
Department: Genetics
Type: Schools of Medicine
DUNS #: 608195277
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599
Fan, Ruzong; Wang, Yifan; Chiu, Chi-Yang et al. (2016) Meta-analysis of Complex Diseases at Gene Level with Generalized Functional Linear Models. Genetics 202:457-70
He, Qianchuan; Cai, Tianxi; Liu, Yang et al. (2016) Prioritizing individual genetic variants after kernel machine testing using variable selection. Genet Epidemiol 40:722-731
Zhang, Guosheng; Huang, Kuan-Chieh; Xu, Zheng et al. (2016) Across-Platform Imputation of DNA Methylation Levels Incorporating Nonlocal Information Using Penalized Functional Regression. Genet Epidemiol 40:333-40
Xu, Zheng; Zhang, Guosheng; Duan, Qing et al. (2016) HiView: an integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. BMC Res Notes 9:159
Lange, Ethan M; Ribado, Jessica V; Zuhlke, Kimberly A et al. (2016) Assessing the Cumulative Contribution of New and Established Common Genetic Risk Factors to Early-Onset Prostate Cancer. Cancer Epidemiol Biomarkers Prev 25:766-72
Abdo, Nour; Xia, Menghang; Brown, Chad C et al. (2015) Population-based in vitro hazard and concentration-response assessment of chemicals: the 1000 genomes high-throughput screening study. Environ Health Perspect 123:458-66
Hu, Yi-Juan; Li, Yun; Auer, Paul L et al. (2015) Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations. Proc Natl Acad Sci U S A 112:1019-24
Wang, Xuexia; Zhang, Shuanglin; Li, Yun et al. (2015) A powerful approach to test an optimally weighted combination of rare variants in admixed populations. Genet Epidemiol 39:294-305
Bi, Wenjian; Kang, Guolian; Zhao, Yanlong et al. (2015) SVSI: fast and powerful set-valued system identification approach to identifying rare variants in sequencing studies for ordered categorical traits. Ann Hum Genet 79:294-309
Urrutia, Eugene; Lee, Seunggeun; Maity, Arnab et al. (2015) Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT). Stat Interface 8:495-505

Showing the most recent 10 out of 37 publications