The completion of the Human Genome Project brought the promise of genomic medicine: the use of genomic information for the prevention, diagnosis and treatment of disease. Yet, despite great progress in genotyping technologies, our ability to predict genetic predisposition to complex human traits and diseases remains very limited. Part of the explanation for this paradox may lie in the limitations of the statistical methods commonly used in genome-wide association studies. We believe that alternative methods, largely adapted from the field of animal breeding (WGP, whole-genome prediction), can enhance our ability to predict complex human traits and diseases, thus paving the way towards more intensive use of genomic information in personalized medicine. However, the populations to which WGP has been successfully applied differ greatly from human populations in aspects such as selection history, the distribution of allele frequencies, the extent of linkage disequilibrium (LD) and inbreeding. Preliminary evidence indicates that these factors can affect the predictive performance of WGP. Therefore, a comprehensive evaluation of WGP with human data is needed, and new methods may need to be developed to cope with the challenges posed by the prediction of complex human traits. We propose a framework to study the factors affecting the ability of WGP to account for and to predict variance at unobserved QTL. Using this framework, and a combination of simulation and real-data analysis, we will produce the first comprehensive evaluation of existing WGP methods with human data and will quantify the effects of key features of the data, of the trait of interest, and of the regression method on the prediction accuracy of existing WGP procedures. We will use this information to develop new methods designed to address the limitations of existing ones.

Public Health Relevance

Project Narrative: Despite great progress in genotyping technologies, our ability to predict genetic risk remains very limited. We believe that alternative statistical methods (WGP, whole-genome prediction), largely adapted from animal breeding, may offer opportunities for advancing our ability to predict important health outcomes. However, human populations differ from animal and plant breeding populations in aspects that can greatly affect the predictive performance of WGP. Using a combination of simulations and real-data analysis, we will: (a) produce a comprehensive evaluation of existing WGP methods with human data, (b) quantify the effects of key features of the data, of the trait of interest, and of the regression method on the prediction accuracy of WGP, and (c) develop new regression procedures designed to address the limitations identified in existing ones.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genetic Variation and Evolution Study Section (GVE)
Program Officer
Krasnewich, Donna M
University of Alabama at Birmingham
Biostatistics & Other Math Sci
Schools of Public Health
United States
Sun, Xiaochen; Fernando, Rohan; Dekkers, Jack (2016) Contributions of linkage disequilibrium and co-segregation information to the accuracy of genomic prediction. Genet Sel Evol 48:77
Vazquez, Ana I; Veturi, Yogasudha; Behring, Michael et al. (2016) Increased Proportion of Variance Explained and Prediction Accuracy of Survival of Breast Cancer Patients with Use of Whole-Genome Multiomic Profiles. Genetics 203:1425-38
Lopez-Cruz, Marco; Crossa, Jose; Bonnett, David et al. (2015) Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3 (Bethesda) 5:569-82
Gianola, Daniel; de los Campos, Gustavo; Toro, Miguel A et al. (2015) Do Molecular Markers Inform About Pleiotropy? Genetics 201:23-9
Lian, Lian; de los Campos, Gustavo (2015) FW: An R Package for Finlay-Wilkinson Regression that Incorporates Genomic/Pedigree Information and Covariance Structures Between Environments. G3 (Bethesda) 6:589-97
Lehermeier, Christina; Schön, Chris-Carolin; de los Campos, Gustavo (2015) Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models. Genetics 201:323-37
Ferragina, A; de los Campos, G; Vazquez, A I et al. (2015) Bayesian regression models outperform partial least squares methods for predicting milk components and technological properties using infrared spectral data. J Dairy Sci 98:8133-51
Lebrón-Aldea, Dayanara; Dhurandhar, Emily J; Pérez-Rodríguez, Paulino et al. (2015) Integrated genomic and BMI analysis for type 2 diabetes risk assessment. Front Genet 6:75
Vazquez, Ana I; Klimentidis, Yann C; Dhurandhar, Emily J et al. (2015) Assessment of whole-genome regression for type II diabetes. PLoS One 10:e0123818
de los Campos, Gustavo; Veturi, Yogasudha; Vazquez, Ana I et al. (2015) Incorporating Genetic Heterogeneity in Whole-Genome Regressions Using Interactions. J Agric Biol Environ Stat 20:467-490

Showing the most recent 10 out of 30 publications