An important goal in cancer research is to identify genomic biomarkers that improve understanding of the genetic basis of cancers and support models that predict cancer occurrence and progression. Many studies have used microarrays to identify genes with altered expression levels in various cancer tissues. Meta-analysis makes it possible to (1) effectively combine experiments that use different microarray platforms and/or other experimental setups; (2) obtain more reliable and consistent gene identification results across studies, along with more satisfactory predictions; and (3) identify genes that are commonly activated in different types of cancer. The proposed study is the first to investigate novel regularized methods for microarray meta-analysis where cancer clinical outcomes are measured along with gene expressions in multiple independent experiments. The proposed approaches can (1) effectively combine data from different platforms and experimental setups; (2) carry out efficient biomarker selection and predictive model building simultaneously; and (3) identify influential genes that are important across different experiments, while allowing for experiment-specific predictive models.
The specific aims of this study are to: (1) develop the MTGDR (Meta Threshold Gradient Directed Regularization) method for regularized microarray meta-analysis; (2) develop a penalized group-bridge method for regularized microarray meta-analysis; and (3) apply the proposed general methodologies to cancer classification and survival analysis with microarray data. User-friendly R packages implementing the proposed approaches will be developed and made publicly available. We will consider cancer microarray meta-analysis where individual experiments can have categorical clinical outcomes and right-censored survival outcomes. Practical cancer studies will be analyzed and extensive simulations conducted to assess the performance of the proposed approaches and compare them with alternatives. In this application, we emphasize not only the development of new general methodologies, but also their computer implementation, applications, and empirical performance.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Small Research Grants (R03)
Project #
5R03LM009828-02
Application #
7685435
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2008-09-01
Project End
2011-08-31
Budget Start
2009-09-01
Budget End
2011-08-31
Support Year
2
Fiscal Year
2009
Total Cost
$79,052
Indirect Cost
Name
Yale University
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
043207562
City
New Haven
State
CT
Country
United States
Zip Code
06520
Huang, Yuan; Huang, Jian; Shia, Ben-Chang et al. (2012) Identification of cancer genomic markers via integrative sparse boosting. Biostatistics 13:509-22
Huang, Jian; Ma, Shuangge; Li, Hongzhe et al. (2011) The Sparse Laplacian Shrinkage Estimator for High-Dimensional Regression. Ann Stat 39:2021-2046
Ma, Shuangge; Huang, Jian; Wei, Fengrong et al. (2011) Integrative analysis of multiple cancer prognosis studies with gene expression measurements. Stat Med 30:3361-71
Ma, Shuangge; Huang, Jian; Song, Xiao (2011) Integrative analysis and variable selection with multiple high-dimensional data sets. Biostatistics 12:763-75
Ma, Shuangge; Kosorok, Michael R (2010) Detection of gene pathways with predictive power for breast cancer prognosis. BMC Bioinformatics 11:1
Ma, Shuangge; Huang, Jian; Moran, Meena S (2009) Identification of genes associated with multiple cancers via integrative analysis. BMC Genomics 10:535
Ma, Shuangge; Huang, Jian (2009) Regularized gene selection in cancer microarray meta-analysis. BMC Bioinformatics 10:1