A new mouse resource, the Collaborative Cross (CC), will provide access to the most diverse mouse strains ever created, which will more closely reflect the genetic variation found in humans. Outbred recombinant inbred intercrosses (RIX) can be generated by producing F1 hybrids of parental CC recombinant inbred (RI) lines to mimic human populations. CC RIX will greatly enhance our ability to understand some of today's most common and complex diseases. The combined information on genotype, expression, and complex phenotypes of RIX will be among the richest ever compiled. The success of the CC project relies heavily on sound experimental design and appropriate statistical analysis, both of which we address in this proposal. The ultimate goal of the proposal is to provide scientists working on CC mice with a statistical analysis platform containing analytical tools designed specifically for CC mouse data, ranging from simple univariate analysis, through more complicated multivariate and longitudinal data analysis, to highly complex integrated high-dimensional data analysis.
The specific aims of this project are: 1) to develop appropriate univariate analysis tools that account for the special relatedness structure of CC RIX samples; 2) to extend the analysis methods in Aim 1 to more complicated longitudinal and multivariate phenotypes, and to selected phenotypes; 3) to jointly model the relationship between DNA, gene expression, and phenotype; and 4) to develop strategies for selecting CC RIX lines for predictive biology and for accurate phenotypic prediction. The proposed project not only addresses common analytical challenges faced by most high-dimensional genome-wide genetic studies, but also identifies unique features of CC projects, such as phenotype selection, and develops novel statistical methods to address them. The performance of the proposed methods will be evaluated in extensive simulation studies covering a wide range of simulation setups and genetic models. Software to carry out the specific aims will be developed and implemented in R or C for public distribution.

Public Health Relevance

A new mouse resource, the Collaborative Cross (CC), will provide access to the most diverse mouse strains ever created, which will more closely reflect the genetic variation found in humans. These strains will greatly enhance our ability to understand some of today's most common and complex diseases. The success of the CC project relies heavily on sound experimental design and appropriate statistical analysis, both of which we address in this proposal. The ultimate goal of the proposal is to provide scientists working on CC mice with a statistical analysis platform containing analytical tools designed specifically for CC mouse data, ranging from simple univariate analysis, through more complicated multivariate and longitudinal data analysis, to highly complex integrated high-dimensional data analysis. The proposed project not only addresses common analytical challenges faced by most high-dimensional genome-wide genetic studies, but also identifies unique features of CC projects, such as phenotype selection and CC line selection for predictive biology, and develops novel statistical methods to address them.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM074175-08
Application #: 8711483
Study Section: Genetic Variation and Evolution Study Section (GVE)
Program Officer: Krasnewich, Donna M
Project Start: 2006-04-01
Project End: 2015-08-31
Budget Start: 2014-09-01
Budget End: 2015-08-31
Support Year: 8
Fiscal Year: 2014
Total Cost:
Indirect Cost:

Name: University of North Carolina Chapel Hill
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #:
City: Chapel Hill
State: NC
Country: United States
Zip Code: 27599

Crowley, James J; Zhabotynsky, Vasyl; Sun, Wei et al. (2015) Analyses of allele-specific gene expression in highly divergent mouse crosses identifies pervasive allelic imbalance. Nat Genet 47:353-60
Lu, Zhao-Hua; Zhu, Hongtu; Knickmeyer, Rebecca C et al. (2015) Multiple SNP Set Analysis for Genome-Wide Association Studies Through Bayesian Latent Variable Selection. Genet Epidemiol 39:664-77
Sun, Wei; Liu, Yufeng; Crowley, James J et al. (2015) IsoDOT Detects Differential RNA-isoform Expression/Usage with respect to a Categorical or Continuous Covariate with High Sensitivity and Specificity. J Am Stat Assoc 110:975-986
Yin, Zhaoyu; Xia, Kai; Chung, Wonil et al. (2015) Fast eQTL Analysis for Twin Studies. Genet Epidemiol 39:357-65
Ha, Min Jin; Sun, Wei (2014) Partial correlation matrix estimation using ridge penalty followed by thresholding and re-estimation. Biometrics 70:765-73
Wright, Fred A; Sullivan, Patrick F; Brooks, Andrew I et al. (2014) Heritability and genomics of gene expression in peripheral blood. Nat Genet 46:430-7
Rashid, Naim U; Sun, Wei; Ibrahim, Joseph G (2014) Some Statistical Strategies for DAE-seq Data Analysis: Variable Selection and Modeling Dependencies among Observations. J Am Stat Assoc 109:78-94
Zou, Fei; Sun, Wei; Crowley, James J et al. (2014) A novel statistical approach for jointly analyzing RNA-Seq data from F1 reciprocal crosses and inbred lines. Genetics 197:389-99
Lee, Seunggeun; Zou, Fei; Wright, Fred A (2014) Convergence of Sample Eigenvalues, Eigenvectors, and Principal Component Scores for Ultra-High Dimensional Data. Biometrika 101:484-490
Xia, Kai; Yu, Yang; Ahn, Mihye et al. (2014) Environmental and genetic contributors to salivary testosterone levels in infants. Front Endocrinol (Lausanne) 5:187

Showing the most recent 10 out of 41 publications