A new mouse resource, the Collaborative Cross (CC), will provide access to the most diverse mouse strains ever created, which will more closely reflect the genetic variation in humans. Outbred recombinant inbred intercrosses (RIX) can be generated by producing F1 hybrids of parental CC recombinant inbred (RI) lines to mimic human populations. CC RIX will greatly enhance our ability to understand some of today's most common and complex diseases. The combined information on genotype, expression, and complex phenotypes of RIX will be among the richest ever compiled. The success of the CC project relies heavily on good experimental designs and appropriate statistical analysis, both of which we address in this proposal. The ultimate goal of the proposal is to provide scientists working on CC mice with a statistical analysis platform containing analytical tools designed specifically for CC mouse data, ranging from simple univariate analysis, through more complicated multivariate and longitudinal data analysis, to highly complex integrated high-dimensional data analysis.
The specific aims of this project are: 1) developing appropriate univariate analysis tools that account for the special relatedness structure of CC RIX samples; 2) extending the analysis methods in Aim 1 to more complicated longitudinal and multivariate phenotypes, and to selected phenotypes; 3) joint modeling of the relationship between DNA, gene expression, and phenotype; and 4) developing strategies for selecting CC RIX lines for predictive biology and for accurate phenotypic prediction. The proposed project not only addresses common analytical challenges faced by most high-dimensional genome-wide genetic studies, but also identifies unique features of CC projects, such as phenotype selection, and develops novel statistical methods to address them. The performance of the proposed methods will be evaluated by extensive simulation studies covering a wide range of simulation setups and genetic models. Software to carry out the specific aims will be implemented in R or C for public distribution.

Public Health Relevance

A new mouse resource, the Collaborative Cross (CC), will provide access to the most diverse mouse strains ever created, which will more closely reflect the genetic variation in humans. These strains will greatly enhance our ability to understand some of today's most common and complex diseases. The success of the CC project relies heavily on good experimental designs and appropriate statistical analysis, both of which we address in this proposal. The ultimate goal of the proposal is to provide scientists working on CC mice with a statistical analysis platform containing analytical tools designed specifically for CC mouse data, ranging from simple univariate analysis, through more complicated multivariate and longitudinal data analysis, to highly complex integrated high-dimensional data analysis. The proposed project not only addresses common analytical challenges faced by most high-dimensional genome-wide genetic studies, but also identifies unique features of CC projects, such as phenotype selection and CC line selection for predictive biology, and develops novel statistical methods to address them.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM074175-07
Application #: 8538418
Study Section: Genetic Variation and Evolution Study Section (GVE)
Program Officer: Krasnewich, Donna M
Project Start: 2006-04-01
Project End: 2015-08-31
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year: 7
Fiscal Year: 2013
Total Cost: $214,945
Indirect Cost: $69,712

Name: University of North Carolina Chapel Hill
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 608195277
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599

Wright, Fred A; Sullivan, Patrick F; Brooks, Andrew I et al. (2014) Heritability and genomics of gene expression in peripheral blood. Nat Genet 46:430-7
Lee, Seunggeun; Zou, Fei; Wright, Fred A (2014) Convergence of Sample Eigenvalues, Eigenvectors, and Principal Component Scores for Ultra-High Dimensional Data. Biometrika 101:484-490
Rashid, Naim U; Sun, Wei; Ibrahim, Joseph G (2014) Some Statistical Strategies for DAE-seq Data Analysis: Variable Selection and Modeling Dependencies among Observations. J Am Stat Assoc 109:78-94
Ha, Min Jin; Sun, Wei (2014) Partial correlation matrix estimation using ridge penalty followed by thresholding and re-estimation. Biometrics 70:765-73
Zou, Fei; Sun, Wei; Crowley, James J et al. (2014) A novel statistical approach for jointly analyzing RNA-Seq data from F1 reciprocal crosses and inbred lines. Genetics 197:389-99
Szatkiewicz, Jin P; Wang, WeiBo; Sullivan, Patrick F et al. (2013) Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation. Nucleic Acids Res 41:1519-32
Sun, Wei; Hu, Yijuan (2013) eQTL Mapping Using RNA-seq Data. Stat Biosci 5:198-219
Gong, Yi; Zou, Fei (2012) Varying coefficient models for mapping quantitative trait loci using recombinant inbred intercrosses. Genetics 190:475-86
Liu, Fei; Dunson, David; Zou, Fei (2011) High-dimensional variable selection in meta-analysis for censored data. Biometrics 67:504-12
Yuan, Zhongshang; Zou, Fei; Liu, Yanyan (2011) Bayesian multiple quantitative trait loci mapping for recombinant inbred intercrosses. Genetics 188:189-95

Showing the most recent 10 out of 25 publications