This project develops better statistical methods to dissect complex trait variation and to predict outcomes from genome-wide marker data. It anticipates that individual risk prediction for disease will become an integral part of genomic medicine in the USA and elsewhere. Predicting an individual's risk of disease from genetic data does not require identifying the causal variants or fully understanding the biology; all that is needed is a predictor that is correlated with outcome. The statistically best predictor depends on the genetic architecture of the trait: the distribution of effect sizes of causal variants, the distribution of their allele frequencies, and the correlation between the two. Therefore, methods that improve our understanding of the genetic architecture of complex traits will lead to better statistical prediction methods, and the performance of prediction methods will in turn yield new inferences about genetic architecture. We will develop, test and apply statistical genetic methods that use whole-genome genotype or sequence data from population-based samples that have also been phenotyped for one or more complex traits; that estimate locus-specific, chromosome-wide and whole-genome matrices of genetic covariance between all pairs of individuals; and that estimate the variance components associated with these matrices. We will use these results, together with those from large genome-wide association studies, to estimate the distribution of SNP and chromosome-segment effects by fitting mixture models with an EM algorithm. We will use simulation to calibrate the observed distribution of risk allele frequencies for disease against evolutionary models whose parameters include the mode of natural selection and the pleiotropic relationships between effects on fitness and on disease. We will develop and test Bayesian and non-Bayesian linear mixed models that use all available genetic data simultaneously to predict an individual's risk of disease.
We will implement prediction methods using data from the Program Grant investigators, from large international research consortia and from the public domain, and test their efficiency by correlating outcome with predictors in independent data sets.
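
The abstract does not say which estimator will be used for the matrices of genetic covariance between individuals. As an illustration only, the sketch below computes one commonly used form: genotypes coded as 0/1/2 allele counts are standardized by the sample allele frequency at each SNP, and the relationship between two individuals is the average product of their standardized genotypes. The function name, the toy data and the choice to skip monomorphic SNPs are assumptions made for this example, not details taken from the project.

```python
from math import sqrt


def genetic_relationship_matrix(genotypes):
    """Return an n x n relationship matrix from 0/1/2 allele counts.

    genotypes: one list per individual, one allele count per SNP.
    Hypothetical helper for illustration; not the project's own code.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    standardized = [[] for _ in range(n)]
    for j in range(m):
        column = [row[j] for row in genotypes]
        p = sum(column) / (2 * n)  # sample frequency of the counted allele
        if p == 0.0 or p == 1.0:
            continue  # monomorphic SNP: no variance, so it is skipped
        sd = sqrt(2 * p * (1 - p))  # binomial (Hardy-Weinberg) standard deviation
        for i in range(n):
            standardized[i].append((column[i] - 2 * p) / sd)
    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs to build the matrix from")
    # Entry (i, k) is the mean product of standardized genotypes over SNPs.
    return [
        [
            sum(a * b for a, b in zip(standardized[i], standardized[k])) / used
            for k in range(n)
        ]
        for i in range(n)
    ]


# Toy data: 4 individuals, 4 SNPs (the last SNP is monomorphic).
toy_genotypes = [
    [0, 1, 2, 0],
    [1, 1, 2, 0],
    [2, 0, 1, 0],
    [1, 2, 0, 0],
]
grm = genetic_relationship_matrix(toy_genotypes)
for row in grm:
    print(" ".join(f"{value:7.3f}" for value in row))
```

Restricting the SNPs passed in to a single locus or chromosome would give the locus-specific or chromosome-wide analogue of the same matrix. Because allele frequencies are estimated from the same sample, each row of the resulting matrix sums to zero.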

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Program Projects (P01)
Project #: 5P01GM099568-02
Application #: 8479391
Study Section: Special Emphasis Panel (ZRG1-GGG-M)
Project Start:
Project End:
Budget Start: 2013-05-01
Budget End: 2014-04-30
Support Year: 2
Fiscal Year: 2013
Total Cost: $133,768
Indirect Cost:
Name: University of Washington
Department:
Type:
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195
Xue, Angli; Wu, Yang; Zhu, Zhihong et al. (2018) Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat Commun 9:2941
Marigorta, Urko M; Rodríguez, Juan Antonio; Gibson, Greg et al. (2018) Replicability and Prediction: Lessons and Challenges from GWAS. Trends Genet 34:504-517
Pappas, D J; Lizee, A; Paunic, V et al. (2018) Significant variation between SNP-based HLA imputations in diverse populations: the last mile is the hardest. Pharmacogenomics J 18:367-376
Mo, Angela; Marigorta, Urko M; Arafat, Dalia et al. (2018) Disease-specific regulation of gene expression in a comparative analysis of juvenile idiopathic arthritis and inflammatory bowel disease. Genome Med 10:48
Qi, Ting; Wu, Yang; Zeng, Jian et al. (2018) Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood. Nat Commun 9:2282
Yengo, Loic; Visscher, Peter M (2018) Assortative mating on complex traits revisited: Double first cousins and the X-chromosome. Theor Popul Biol 124:51-60
Browning, Sharon R; Browning, Brian L; Daviglus, Martha L et al. (2018) Ancestry-specific recent effective population size in the Americas. PLoS Genet 14:e1007385
Yengo, Loic; Zhu, Zhihong; Wray, Naomi R et al. (2017) Detection and quantification of inbreeding depression for complex traits from SNP data. Proc Natl Acad Sci U S A 114:8602-8607
Weir, Bruce S; Goudet, Jérôme (2017) A Unified Characterization of Population Structure and Relatedness. Genetics 206:2085-2103
Zheng, Xiuwen; Gogarten, Stephanie M; Lawrence, Michael et al. (2017) SeqArray-a storage-efficient high-performance data format for WGS variant calls. Bioinformatics 33:2251-2257

Showing the most recent 10 out of 152 publications