Genome-wide association studies (GWAS) have become the primary approach for dissecting the genetic basis of complex diseases and are powerful for detecting common alleles that influence disease risk. To date, GWAS have identified hundreds of putative disease loci. Despite this progress, these newly discovered loci typically account for only a small fraction of disease heritability. This raises new questions about where and how to find the remaining genetic variation that contributes to susceptibility to complex and common diseases. Potential sources of missing heritability are (1) rare variants, (2) gene-gene and gene-environment interactions, (3) combinations of multiple SNPs, each with a small genetic effect but collectively conferring large risk, and (4) structural variation. Current statistical methods for genetic analysis are well suited to detecting common variants, but new models and methods of analysis are needed to reveal the sources of missing heritability. To this end, the goal of this proposal is to develop novel and powerful statistical methods for studying rare variants and gene-gene interactions in next-generation sequencing and GWAS data. Specifically, the methods will provide a unified analytical framework for testing associations with both common and rare alleles, as well as their interactions with genetic and environmental factors. We will also develop graphical models and other statistical methods for co-association and interaction network analysis. The power of these methods will be rigorously evaluated by theoretical analysis and simulation, and the methods will be applied to existing GWAS data sets (psoriasis and rheumatoid arthritis) and to next-generation sequencing data on extreme cardiovascular phenotypes funded by NIH grant 1RC2 HL02419-01.
This project aims to develop novel and powerful statistical methods for genetic association and interaction analysis of next-generation sequencing data, and for finding the missing heritability left unexplained by current GWAS. Applying these methods to sequence data will help identify the entire spectrum of genetic variation that influences disease and provide potentially valuable tools for developing diagnostic and interventional strategies.