The long-term objective of this project is to develop powerful and computationally efficient statistical methods for identifying the genes, environmental risk factors, and interactions among them that underlie complex traits related to human disease and health.
The specific aim of this project is to continue developing survival analysis and genetic models that incorporate age-of-onset data, environmental covariate information, gene-gene and gene-environment interactions, and multiple disease loci into haplotype-based genetic association analysis, analysis of single nucleotide polymorphisms (SNPs), and admixture mapping of complex traits in population-based cohort studies. The project also evaluates different study designs for genetic association studies. The proposed methods build on current approaches and hinge on a novel integration of methods from survival analysis, high-dimensional data analysis, and human genetics. The focus will be on developing rigorous and comprehensive statistical inference procedures for haplotype analysis, gene-gene and gene-environment interaction analysis, and admixture mapping in cohort studies of unrelated individuals collected under different study designs, including case-cohort and nested case-control designs. Likelihood-based inference, hidden Markov models, and threshold gradient descent methods will be developed for these aims. The project will also investigate the robustness, power, and efficiency of these methods and compare them with existing methods. In addition, the project will develop practical computer programs that implement the proposed methods, and will evaluate the performance of these methods through simulation and through application to real data on breast and ovarian cancer risks among BRCA1/2 carriers and to data sets in pharmacogenomics. The proposed work will contribute statistical methodology for studying complex traits and for high-dimensional data analysis, and will offer insight into each of the clinical areas represented by the data sets used to evaluate these new methods.
All programs developed under this grant and detailed documentation will be made available free-of-charge to interested researchers via the World Wide Web.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Environmental Health Sciences (NIEHS)
Type
Research Project (R01)
Project #
5R01ES009911-12
Application #
7683874
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
McAllister, Kimberly A
Project Start
1998-09-30
Project End
2011-08-31
Budget Start
2009-09-01
Budget End
2011-08-31
Support Year
12
Fiscal Year
2009
Total Cost
$284,760
Indirect Cost
Name
University of Pennsylvania
Department
Dermatology
Type
Schools of Medicine
DUNS #
042250712
City
Philadelphia
State
PA
Country
United States
Zip Code
19104
Yin, Jianxin; Li, Hongzhe (2013) Adjusting for High-dimensional Covariates in Sparse Precision Matrix Estimation by ℓ1-Penalization. J Multivar Anal 116:365-381
Vardhanabhuti, Saran; Li, Mingyao; Li, Hongzhe (2013) A Hierarchical Bayesian Model for Estimating and Inferring Differential Isoform Expression for Multi-Sample RNA-Seq Data. Stat Biosci 5:119-137
Yin, Jianxin; Li, Hongzhe (2012) Model Selection and Estimation in the Matrix Normal Graphical Model. J Multivar Anal 107:119-140
Kalli, Anastasia; Hess, Sonja (2012) Effect of mass spectrometric parameters on peptide and protein identification rates for shotgun proteomic experiments on an LTQ-orbitrap mass analyzer. Proteomics 12:21-31
Daye, Z John; Xie, Jichun; Li, Hongzhe (2012) A Sparse Structured Shrinkage Estimator for Nonparametric Varying-Coefficient Model with an Application in Genomics. J Comput Graph Stat 21:110-133
Daye, Z John; Li, Hongzhe; Wei, Zhi (2012) A powerful test for multiple rare variants association studies that incorporates sequencing qualities. Nucleic Acids Res 40:e60
Sun, Hokeun; Li, Hongzhe (2012) Robust Gaussian graphical modeling via l1 penalization. Biometrics 68:1197-206
Cai, T Tony; Jeng, X Jessie; Li, Hongzhe (2012) Robust Detection and Identification of Sparse Segments in Ultra-High Dimensional Data Analysis. J R Stat Soc Series B Stat Methodol 74:773-797
Nguyen, Le B; Diskin, Sharon J; Capasso, Mario et al. (2011) Phenotype restricted genome-wide association study using a gene-centric approach identifies three low-risk neuroblastoma susceptibility Loci. PLoS Genet 7:e1002026
Xie, Jichun; Cai, T Tony; Li, Hongzhe (2011) Sample size and power analysis for sparse signal recovery in genome-wide association studies. Biometrika 98:273-290

Showing the most recent 10 out of 52 publications