Massively parallel sequencing has transformed the field of genomic studies. These new technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They also hold the promise of helping to explain some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for application to the study of complex human traits. The methodological work lags in at least three major areas. First, few, if any, publications address the optimal design of sequencing-based studies for complex traits while accounting for the complex dynamics of sequencing cost, which is needed to explore the full range of sample sizes and sequencing depths. Second, there are no published methods for the analysis of low-coverage (in the range of 2-4X) sequencing data. Low-coverage sequencing is being used to study complex diseases and traits because it can lead to substantial gains in power by increasing the effective sample size, which is critical for detecting the moderate genetic effects typical of complex human traits. Third, the field needs statistical methods that can efficiently analyze rare variants derived from various designs of sequencing-based studies. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies for complex human traits. To do so, we propose the following four specific aims: 1) Develop a unified statistical framework for SNP calling, genotyping, and haplotyping from sequencing and genotyping data. 2) Provide alternative design options for sequencing-based genetic studies. 3) Develop statistical methods for the analysis of rare variants. 4) Develop, distribute, and support freely available software packages for the methods proposed in this application.
The proposed methods will be evaluated through analytical approaches, computer simulations and applications to multiple real datasets.
Massively parallel sequencing has transformed the field of genomic studies. These new technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders and hold the promise of helping to explain some of the missing heritability from genome-wide association studies of complex traits. However, the development of robust statistical and computational methods has fallen seriously behind the technological advances, particularly for application to the study of complex human traits. In this application, we will establish a comprehensive statistical framework for the design and analysis of sequencing-based studies for complex human traits.