The long-term objective of this project is to develop powerful and computationally efficient statistical methods for identifying genes underlying complex genetic diseases in humans.
The specific aim of this project is to continue developing survival models that incorporate age-of-onset data, environmental covariates, gene-environment interactions, and multiple disease loci into family-based association analysis, joint linkage and linkage disequilibrium analysis, and multipoint multi-trait-locus linkage analysis of complex human diseases. The proposed methods build on our current methods and hinge on a novel integration of methods from multivariate survival analysis and modern human genetics. The focus will be on developing survival models for: (1) incorporating age of onset and environmental risk factors into genetic association studies, using a linkage disequilibrium-based Cox model for family data of any size; (2) joint analysis of linkage and linkage disequilibrium for age-of-onset data from nuclear families; and (3) multipoint multi-trait-locus linkage tests that incorporate age of onset and environmental covariates, using the additive genetic frailty model. The project will also investigate the power and efficiency of these methods and compare them with existing methods. In addition, the project will develop practical computer programs that implement the proposed methods and will evaluate their performance through extensive simulations and application to real data on HLA-associated diseases, including type 1 diabetes, rheumatoid arthritis, celiac disease, narcolepsy, and ankylosing spondylitis. The proposed work will contribute statistical methodology both to gene mapping for complex diseases and to multivariate survival analysis, offer insight into each of the clinical areas represented by the data sets used to evaluate the new methods, and facilitate the eventual identification of genes involved in these complex diseases.
All programs developed under this grant, together with detailed documentation, will be made freely available to interested researchers via the World Wide Web.
|Yin, Jianxin; Li, Hongzhe (2013) Adjusting for High-dimensional Covariates in Sparse Precision Matrix Estimation by ℓ1-Penalization. J Multivar Anal 116:365-381|
|Vardhanabhuti, Saran; Li, Mingyao; Li, Hongzhe (2013) A Hierarchical Bayesian Model for Estimating and Inferring Differential Isoform Expression for Multi-Sample RNA-Seq Data. Stat Biosci 5:119-137|
|Yin, Jianxin; Li, Hongzhe (2012) Model Selection and Estimation in the Matrix Normal Graphical Model. J Multivar Anal 107:119-140|
|Kalli, Anastasia; Hess, Sonja (2012) Effect of mass spectrometric parameters on peptide and protein identification rates for shotgun proteomic experiments on an LTQ-orbitrap mass analyzer. Proteomics 12:21-31|
|Daye, Z John; Xie, Jichun; Li, Hongzhe (2012) A Sparse Structured Shrinkage Estimator for Nonparametric Varying-Coefficient Model with an Application in Genomics. J Comput Graph Stat 21:110-133|
|Daye, Z John; Li, Hongzhe; Wei, Zhi (2012) A powerful test for multiple rare variants association studies that incorporates sequencing qualities. Nucleic Acids Res 40:e60|
|Sun, Hokeun; Li, Hongzhe (2012) Robust Gaussian graphical modeling via l1 penalization. Biometrics 68:1197-206|
|Cai, T Tony; Jeng, X Jessie; Li, Hongzhe (2012) Robust Detection and Identification of Sparse Segments in Ultra-High Dimensional Data Analysis. J R Stat Soc Series B Stat Methodol 74:773-797|
|Nguyen, Le B; Diskin, Sharon J; Capasso, Mario et al. (2011) Phenotype restricted genome-wide association study using a gene-centric approach identifies three low-risk neuroblastoma susceptibility Loci. PLoS Genet 7:e1002026|
|Xie, Jichun; Cai, T Tony; Li, Hongzhe (2011) Sample size and power analysis for sparse signal recovery in genome-wide association studies. Biometrika 98:273-290|