Rapid progress in genotyping technology has greatly advanced our understanding of the genetic basis of various diseases. Several genome-wide association studies (GWAS) of complex diseases have been published, in which genotype data on a large number of single nucleotide polymorphisms (SNPs) are collected to study the association between these SNPs and a disease. Most of these GWAS are limited to single-SNP association analyses. They have identified new loci associated with different diseases, but these loci generally explain very little of the genetic risk. Because GWAS are still underpowered to find small main effects, and gene-gene interactions are likely to play a role, the data may not currently be analyzed to their full potential. In this proposal, we aim to evaluate alternative methods for analyzing GWAS data. We investigate pathway-based approaches that incorporate prior biological information into the analysis and try to detect the effect of a pathway (a collection of SNPs with biological relevance) on a disease, instead of focusing on the effects of individual SNPs. Current pathway-based methods are mostly limited to testing for overrepresentation of pathways in a GWAS, and so focus mainly on the individual effects of the SNPs within a pathway. These methods avoid estimating the large number of parameters involved in jointly modeling the effects of many SNPs within a pathway. Hence, most of them do not take into account possible interactions among multiple SNPs. Moreover, none of these approaches is likelihood-based, so it is not possible to estimate and quantify the overall pathway effect on disease risk and assess its statistical uncertainty.
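As a rough illustration of the overrepresentation idea described above, the sketch below asks whether the SNPs flagged by single-SNP tests fall inside a pathway more often than chance would predict. It uses a one-sided hypergeometric test, which is one common choice for this kind of analysis; the abstract does not say which test the existing methods use, and the function name, argument names, and counts are all hypothetical.

```python
from math import comb


def overrepresentation_p_value(n_total, n_significant, n_pathway, n_overlap):
    """One-sided hypergeometric p-value for pathway overrepresentation.

    n_total       -- number of SNPs tested in the GWAS (hypothetical input)
    n_significant -- number of SNPs flagged by single-SNP tests
    n_pathway     -- number of tested SNPs that belong to the pathway
    n_overlap     -- number of flagged SNPs that belong to the pathway

    Returns P(X >= n_overlap), where X is the overlap expected if the
    flagged SNPs were drawn at random from all tested SNPs.
    """
    max_overlap = min(n_significant, n_pathway)
    denominator = comb(n_total, n_significant)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_significant - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Made-up counts: 1000 SNPs tested, 50 flagged, 20 in the pathway,
    # 5 of the flagged SNPs inside the pathway.
    print(overrepresentation_p_value(1000, 50, 20, 5))
```

A test of this kind only counts individually flagged SNPs, so it says nothing about joint or interaction effects within the pathway and gives no estimate of the size of the pathway effect.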
In Aim 1, this proposal offers a collection of novel statistical methods, as well as a suite of user-friendly software, to study the joint effects of a group of SNPs within a pathway on a complex multifactorial disease, incorporating the possibility of interaction among the SNPs. The model also offers a data reduction strategy that avoids the problems of estimating the large number of parameters that correspond to a large number of SNPs within a pathway. We also propose to compare the existing approaches to pathway-based analysis extensively through simulation studies, which would provide a better understanding of the advantages and limitations of these methods.
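To give a sense of why joint modeling with interactions calls for data reduction, the sketch below counts effect parameters under a simple assumed model: one parameter per SNP (additive coding) plus one per combination of SNPs up to a chosen interaction order. This parameterization and the function name are illustrative assumptions, not the model proposed here.

```python
from math import comb


def n_effect_parameters(n_snps, max_order=2):
    """Count effect parameters, excluding the intercept.

    Assumes one parameter per SNP and one parameter per unordered
    combination of SNPs for each interaction order up to max_order.
    """
    return sum(comb(n_snps, order) for order in range(1, max_order + 1))


if __name__ == "__main__":
    for n_snps in (10, 50, 200):
        print(n_snps, n_effect_parameters(n_snps, max_order=2))
```

Under these assumptions, a pathway of 50 SNPs already needs 1275 parameters for main effects and pairwise interactions alone, which typical GWAS sample sizes cannot estimate reliably without some form of reduction.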
In Aim 2, we investigate the effects of pathways on type 2 diabetes and related quantitative traits in the ARIC population. We will derive multiple pathways related to type 2 diabetes and will use our model, as well as other existing approaches, to compare the effects of these pathways. Our proposed pathway-based GWAS may uncover new SNPs or pathways associated with type 2 diabetes and these quantitative traits, and thus offer deeper insight into the intricate networks of functionally related genes in type 2 diabetes. Finally, in Aim 3, we aim to provide software for conducting pathway-based analyses with our proposed approach. The availability of the software would help genetic epidemiologists carry out more sophisticated pathway-based analyses, and in turn lead to further research and development of statistical methods for pathway analysis. The broader impact of our work lies in its ability to improve the understanding of professionals engaged in unraveling the genetic architecture of complex diseases. There is growing evidence that gene-gene and gene-environment interactions, rather than single genes, contribute to complex diseases. Instead of focusing only on the SNPs with the highest statistical significance, our approach and software will provide an alternative way of analyzing GWAS data. Our findings will improve the understanding of the complex mechanisms of functionally related genes, thereby having far-reaching beneficial effects on the diagnosis and treatment of complex diseases.

Public Health Relevance

The potential of the methods and software we propose is exceedingly broad, since they will substantially improve current pathway-based genome-wide association studies (GWAS). We envision that our method will facilitate a new paradigm for GWAS, one that will not only identify the genes containing significant single nucleotide polymorphisms (SNPs) found by single-SNP analysis, but will also detect new genes in which each single SNP confers a small disease risk while their joint actions are implicated in the development of disease. The pathway-based association analysis will improve the understanding of the complex mechanisms of functionally related genes, thereby having far-reaching beneficial effects on the diagnosis and treatment of complex diseases.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21DK089351-02
Application #: 8097238
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Mckeon, Catherine T
Project Start: 2010-07-01
Project End: 2014-06-30
Budget Start: 2011-07-01
Budget End: 2014-06-30
Support Year: 2
Fiscal Year: 2011
Total Cost: $214,495
Indirect Cost:
Name: University of Minnesota Twin Cities
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 555917996
City: Minneapolis
State: MN
Country: United States
Zip Code: 55455
Ray, Debashree; Li, Xiang; Pan, Wei et al. (2015) A Bayesian Partitioning Model for the Detection of Multilocus Effects in Case-Control Studies. Hum Hered 79:69-79
Zhang, Yiwei; Guan, Weihua; Pan, Wei (2013) Adjustment for population stratification via principal components in association analysis of rare variants. Genet Epidemiol 37:99-109
Basu, Saonli; Zhang, Yiwei; Ray, Debashree et al. (2013) A rapid gene-based genome-wide association test with multivariate traits. Hum Hered 76:53-63
Han, Fang; Pan, Wei (2012) A composite likelihood approach to latent multivariate Gaussian modeling of SNP data with application to genetic association testing. Biometrics 68:307-15
Basu, Saonli; Pan, Wei; Shen, Xiaotong et al. (2011) Multilocus association testing with penalized regression. Genet Epidemiol 35:755-65
Pan, Wei; Basu, Saonli; Shen, Xiaotong (2011) Adaptive tests for detecting gene-gene and gene-environment interactions. Hum Hered 72:98-109
Basu, Saonli; Pan, Wei; Oetting, William S (2011) A dimension reduction approach for modeling multi-locus interaction in case-control studies. Hum Hered 71:234-45
Basu, Saonli; Pan, Wei (2011) Comparison of statistical tests for disease association with rare variants. Genet Epidemiol 35:606-19
Pan, Wei; Shen, Xiaotong (2011) Adaptive tests for association analysis of rare variants. Genet Epidemiol 35:381-8
Pan, Wei (2011) Relationship between genomic distance-based regression and kernel machine regression for multi-marker association testing. Genet Epidemiol 35:211-6