Genome Wide Association Studies (GWAS) have uncovered an unprecedented number of variants associated with important health-related traits and diseases. Evidence from these studies suggests that most clinically relevant traits have complex genetic architectures. Whole Genome Prediction (WGP) is a predictive approach, primarily developed and tested in the field of animal breeding, designed to confront some of the challenges emerging in the prediction of complex traits and diseases. Implementing WGP requires specialized software, which is not available in standard statistical packages. In our research projects involving plant, animal, and, more recently, human data, we have developed, tested, and used statistical software for parametric and semi-parametric WGP. In this project we propose to integrate and further develop this software in ways that will improve its value for applications with human data. We will integrate parametric and semi-parametric procedures for WGP into a unified framework and will deliver software that can be used with uncensored, censored, binary, and ordinal traits. The software produced in this project will be delivered as an R package and will be integrated into GenePattern, a bioinformatics platform where users will be able to develop analysis pipelines by combining our software with other bioinformatics tools.

Public Health Relevance

Genome Wide Association Studies (GWAS) have uncovered an unprecedented number of variants associated with important health-related traits and diseases. Evidence from these studies suggests that most clinically relevant traits have complex genetic architectures. Whole Genome Prediction (WGP) is a predictive approach, primarily developed and tested in the field of animal breeding, designed to confront some of the challenges emerging in the prediction of complex traits and diseases. We believe that this methodology offers great opportunities to advance our ability to predict genetic predisposition to complex human traits and diseases. Implementing WGP methods requires specialized software, which is not available in standard statistical packages. In our research we have developed, tested, and used statistical software for parametric and non-parametric WGP. The proposed project will integrate this software into a unified framework, further develop it by implementing additional regression methods, and extend it to handle traits often encountered in human applications, such as censored, binary, and ordinal outcomes. The software developed in this project will be integrated into R and into GenePattern, a bioinformatics workflow platform that will enable users to combine our software with other bioinformatics tools.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM101219-03
Application #: 8607197
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Project Start: 2012-03-01
Project End: 2015-01-31
Budget Start: 2014-02-01
Budget End: 2015-01-31
Support Year: 3
Fiscal Year: 2014
Total Cost: $196,127
Indirect Cost: $62,252

Name: University of Alabama Birmingham
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 063690705
City: Birmingham
State: AL
Country: United States
Zip Code: 35294

Publications

Vazquez, Ana I; Veturi, Yogasudha; Behring, Michael et al. (2016) Increased Proportion of Variance Explained and Prediction Accuracy of Survival of Breast Cancer Patients with Use of Whole-Genome Multiomic Profiles. Genetics 203:1425-38
Lopez-Cruz, Marco; Crossa, Jose; Bonnett, David et al. (2015) Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3 (Bethesda) 5:569-82
Gianola, Daniel; de los Campos, Gustavo; Toro, Miguel A et al. (2015) Do Molecular Markers Inform About Pleiotropy? Genetics 201:23-9
Lian, Lian; de los Campos, Gustavo (2015) FW: An R Package for Finlay-Wilkinson Regression that Incorporates Genomic/Pedigree Information and Covariance Structures Between Environments. G3 (Bethesda) 6:589-97
Lehermeier, Christina; Schön, Chris-Carolin; de los Campos, Gustavo (2015) Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models. Genetics 201:323-37
Ferragina, A; de los Campos, G; Vazquez, A I et al. (2015) Bayesian regression models outperform partial least squares methods for predicting milk components and technological properties using infrared spectral data. J Dairy Sci 98:8133-51
Lebrón-Aldea, Dayanara; Dhurandhar, Emily J; Pérez-Rodríguez, Paulino et al. (2015) Integrated genomic and BMI analysis for type 2 diabetes risk assessment. Front Genet 6:75
Vazquez, Ana I; Klimentidis, Yann C; Dhurandhar, Emily J et al. (2015) Assessment of whole-genome regression for type II diabetes. PLoS One 10:e0123818
de los Campos, Gustavo; Veturi, Yogasudha; Vazquez, Ana I et al. (2015) Incorporating Genetic Heterogeneity in Whole-Genome Regressions Using Interactions. J Agric Biol Environ Stat 20:467-490
Sørensen, Peter; de los Campos, Gustavo; Morgante, Fabio et al. (2015) Genetic Control of Environmental Variation of Two Quantitative Traits of Drosophila melanogaster Revealed by Whole-Genome Sequencing. Genetics 201:487-97

Showing the most recent 10 out of 27 publications