This project will develop statistical methods for analyzing both genome-wide association studies and studies of multiple candidate genes in which the phenotype of interest is quantitative. The methods will include novel multipoint methods designed to extract the maximum amount of information from the available data, and methods for assessing the significance of results that deal effectively with the large number of comparisons performed in these large-scale studies. The proposed multipoint approach to association mapping is motivated by the fact that, even with a genome-wide scan of 250,000 SNPs, many SNPs affecting phenotype will be untyped. The idea is to assess whether an untyped SNP affects phenotype by first using surrounding haplotypic variation to predict plausible genotypes at the untyped SNP, and then assessing association between the predicted genotypes and the observed phenotypes. The methods for assessing significance will be based on controlling the "False Discovery Rate" (the proportion of positive findings that turn out to be incorrect). The methods will be applied to a genome-wide scan (250,000 SNPs in 1,000 individuals) and candidate gene studies aimed at identifying genetic variants and genes responsible for differential response to statin drugs, and to data from a candidate gene study aimed at identifying genetic variants affecting quantitative phenotypes associated with atherosclerosis, plaque inflammation, and thrombosis, all factors associated with cardiovascular disease. Findings from these studies may aid understanding of the genetic factors affecting cardiovascular disease and its treatment. In addition, user-friendly software implementing the statistical methods will be developed and distributed, giving other researchers conducting similar studies worldwide access to these tools.
These tools have the potential to improve the effectiveness and efficiency of studies aimed at determining the underlying genetic basis of common diseases, potentially leading to new treatment strategies for maintaining health and preventing disease. Public health relevance: This project will generate statistical tools for analyzing large-scale studies that aim to help understand the genetic basis of common diseases and drug response. These tools have the potential to improve the effectiveness and efficiency of such studies, potentially leading to new treatment strategies for maintaining health and preventing disease.
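As an illustration of what controlling the False Discovery Rate can look like in practice, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of per-SNP p-values. This is a standard, generic technique chosen here for illustration only; the abstract does not say which FDR method the project will use, and the function name `benjamini_hochberg`, the `alpha` parameter, and the example p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which tests are declared significant at FDR level alpha.

    Illustrative Benjamini-Hochberg step-up procedure (an assumption,
    not necessarily the method developed in this project).
    Returns a list of booleans in the same order as p_values.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values standing in for ten SNP association tests.
    example = [0.001, 0.008, 0.039, 0.041, 0.042,
               0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example, alpha=0.05))
```

In a genome-wide scan the same logic would be applied to hundreds of thousands of p-values at once. Because the procedure is step-up, a test can be rejected even when its own p-value misses its rank threshold, as long as some higher-ranked p-value meets its threshold.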
|Howie, Bryan; Marchini, Jonathan; Stephens, Matthew (2011) Genotype imputation with thousands of genomes. G3 (Bethesda) 1:457-70|
|Barber, Mathew J; Mangravite, Lara M; Hyde, Craig L et al. (2010) Genome-wide association of lipid-lowering response to statins in combined study populations. PLoS One 5:e9763|
|Wen, Xiaoquan; Stephens, Matthew (2010) Using linear predictors to impute allele frequencies from summary or pooled genotype data. Ann Appl Stat 4:1158-1182|
|Stephens, Matthew; Balding, David J (2009) Bayesian statistical methods for genetic association studies. Nat Rev Genet 10:681-90|
|Reiner, Alexander P; Barber, Mathew J; Guan, Yongtao et al. (2008) Polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1 alpha are associated with C-reactive protein. Am J Hum Genet 82:1193-201|
|Guan, Yongtao; Stephens, Matthew (2008) Practical issues in imputation-based association mapping. PLoS Genet 4:e1000279|