Genome Wide Association Studies (GWAS) have uncovered an unprecedented number of variants associated with important health-related traits and diseases. Evidence from these studies suggests that most clinically relevant traits have complex genetic architectures. Whole Genome Prediction (WGP) is a predictive approach, primarily developed and tested in the field of animal breeding, designed to confront some of the challenges emerging in the prediction of complex traits and diseases. Implementing WGP requires specialized software, which is not available in standard statistical packages. In our research projects involving plant, animal and, more recently, human data, we have developed, tested and used statistical software for parametric and semi-parametric WGP. In this project we propose to integrate and further develop this software in ways that will improve its value for applications with human data. We will integrate parametric and semi-parametric procedures for WGP into a unified framework and will deliver software that can be used with uncensored, censored, binary and ordinal traits. The software produced in this project will be delivered as an R package and will be integrated into GenePattern, a bioinformatics platform where users will be able to develop analysis pipelines by combining our software with other bioinformatics tools.

Public Health Relevance

Genome Wide Association Studies (GWAS) have uncovered an unprecedented number of variants associated with important health-related traits and diseases. Evidence from these studies suggests that most clinically relevant traits have complex genetic architectures. Whole Genome Prediction (WGP) is a predictive approach, primarily developed and tested in the field of animal breeding, designed to confront some of the challenges emerging in the prediction of complex traits and diseases. We believe that this methodology offers great opportunities to advance our ability to predict genetic predisposition to complex human traits and diseases. Implementing WGP methods requires specialized software, which is not available in standard statistical packages. In our research we have developed, tested, and used statistical software for parametric and non-parametric WGP. The proposed project will integrate this software into a unified framework, further develop it by implementing additional regression methods, and extend it to handle traits often encountered in human applications, such as censored, binary and ordinal outcomes. The software developed in this project will be integrated into R and into GenePattern, a bioinformatics workflow platform that will enable users to combine our software with other bioinformatics tools.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM101219-03
Application #: 8607197
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Project Start: 2012-03-01
Project End: 2015-01-31
Budget Start: 2014-02-01
Budget End: 2015-01-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $196,127
Indirect Cost: $62,252

Name: University of Alabama Birmingham
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 063690705
City: Birmingham
State: AL
Country: United States
Zip Code: 35294

de los Campos, G; Sorensen, D (2014) On the genomic analysis of data from structured populations. J Anim Breed Genet 131:163-4
Pérez, Paulino; de los Campos, Gustavo (2014) Genome-wide regression and prediction with the BGLR statistical package. Genetics 198:483-95
Jarquin, Diego; Crossa, Jose; Lacaze, Xavier et al. (2014) A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theor Appl Genet 127:595-607
Klimentidis, Yann C; Wineinger, Nathan E; Vazquez, Ana I et al. (2014) Multiple metabolic genetic risk scores and type 2 diabetes risk in three racial/ethnic groups. J Clin Endocrinol Metab 99:E1814-8
de los Campos, Gustavo; Sorensen, Daniel A (2013) A commentary on Pitfalls of predicting complex traits from SNPs. Nat Rev Genet 14:894
Crossa, Jose; Beyene, Yoseph; Kassa, Semagn et al. (2013) Genomic prediction in maize breeding populations with genotyping-by-sequencing. G3 (Bethesda) 3:1903-26
de los Campos, Gustavo; Perez, Paulino; Vazquez, Ana I et al. (2013) Genome-enabled prediction using the BLR (Bayesian Linear Regression) R-package. Methods Mol Biol 1019:299-320
de los Campos, Gustavo; Vazquez, Ana I; Fernando, Rohan et al. (2013) Prediction of complex human traits using the genomic best linear unbiased predictor. PLoS Genet 9:e1003608
de los Campos, Gustavo; Hickey, John M; Pong-Wong, Ricardo et al. (2013) Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 193:327-45