Effective clustering penalized methods for genomic biomarker selection

Ma, Shuangge

Abstract

Cancer is a complex genetic disease that results from the accumulation of multiple genetic defects, including mutations and epigenetic changes. Advances in microarray techniques make it possible to profile gene expression in human tissues on a genome-wide scale, from which genomic biomarkers with predictive power for cancer diagnosis and prognosis can be discovered. Such discovery can lead to a better understanding of cancer genetics, more accurate prediction of tumor behavior, and more rational treatment selection. Effective biomarker selection is the key step connecting wet-lab studies with pharmacogenetic practice. The long-term goal is to provide more effective and reliable biomarker selection methods, make more efficient use of high-dimensional gene expression data, and eventually facilitate clinical practice using genomic measurements. In the present application, we will develop novel clustering penalized methods for biomarker selection in cancer studies with gene expression data. The proposed methods explicitly take into account the clustered nature of gene expression. They are able to identify a few important gene clusters, and a few important genes within those selected clusters, that have influential impacts on cancer outcomes such as cancer status, response to treatment, and cancer survival. They are expected to provide more accurate gene selection and better prediction than existing methods.
The specific aims are as follows. [1] Propose novel clustering penalized methods for biomarker selection at both the cluster level and the within-cluster gene level. We will propose: (a) the Supervised Adaptive Group Lasso (SAGLasso); and (b) the Group Bridge Lasso (GBL). We will investigate computational algorithms, tuning parameter selection, evaluation of gene selection and prediction, and large-sample statistical properties. [2] Cancer classification analysis using the proposed clustering penalized approaches, where the outcome of interest is categorical cancer status or response to therapy. [3] Cancer survival analysis using the proposed clustering penalized approaches, where the outcome is censored survival time. [4] Extensive numerical studies using various cancer gene expression data sets. The approaches developed in Aims 1-3 will be used to analyze ongoing studies as well as publicly available cancer microarray data. We will compare the gene selection results and prediction performance of the proposed approaches with those of existing methods. The proposed study will be the first to establish a rigorous statistical framework that explicitly accounts for the clustered nature of gene expression in cancer biomarker selection. The proposed methods are expected to outperform existing ones in terms of gene selection and prediction performance. We will also investigate cancer classification and survival models in great detail and develop efficient algorithms and portable R/S-Plus packages, which will make the proposed methods easily accessible for routine biomedical data analysis.
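To make the idea of cluster-level selection concrete, the following is a minimal sketch of block soft-thresholding, the proximal step associated with the standard group lasso penalty from the general literature. It is not the SAGLasso or GBL method named in the abstract, whose details are not given here. The function name `group_soft_threshold`, the non-overlapping cluster layout, and the square-root cluster-size weights are all assumptions made for illustration.

```python
import math


def group_soft_threshold(beta, groups, lam):
    """Shrink coefficients cluster by cluster (hypothetical helper).

    Applies the proximal operator of lam * sum_g sqrt(|g|) * ||beta_g||_2.

    beta   : list of floats, one coefficient per gene
    groups : list of index lists, one per gene cluster (assumed non-overlapping)
    lam    : non-negative penalty level
    """
    out = list(beta)
    for idx in groups:
        norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        thresh = lam * math.sqrt(len(idx))
        # A cluster whose norm is at or below the threshold is zeroed as a whole;
        # otherwise every coefficient in it is shrunk by the same factor.
        scale = 0.0 if norm <= thresh else 1.0 - thresh / norm
        for i in idx:
            out[i] = scale * beta[i]
    return out


# Illustrative values only: one strong cluster and one weak cluster.
beta = [3.0, 4.0, 0.1, -0.1]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
```

In this toy input the weak cluster is removed entirely while the strong cluster is kept and shrunk. Note that this plain group penalty selects or drops whole clusters only; it does not by itself pick individual genes inside a retained cluster, which is the additional within-cluster selection that Aim 1 describes.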

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Small Research Grants (R03)
Project #: 1R03LM009754-01A1
Application #: 7532239
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2009-08-01
Project End: 2011-07-31
Budget Start: 2009-08-01
Budget End: 2010-07-31
Support Year: 1
Fiscal Year: 2009
Total Cost: $95,176
Indirect Cost

Institution

Name: Yale University
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 043207562

City: New Haven
State: CT
Country: United States
Zip Code: 06520

Related projects


NIH 2010 R03 LM	Effective clustering penalized methods for genomic biomarker selection Ma, Shuangge / Yale University	$78,429
NIH 2009 R03 LM	Effective clustering penalized methods for genomic biomarker selection Ma, Shuangge / Yale University	$95,176

Publications

Ma, Shuangge; Dai, Ying (2011) Principal component analysis based methods in bioinformatics studies. Brief Bioinform 12:714-22

Huang, Jian; Ma, Shuangge; Li, Hongzhe et al. (2011) The Sparse Laplacian Shrinkage Estimator for High-Dimensional Regression. Ann Stat 39:2021-2046

Ma, Shuangge; Kosorok, Michael R; Huang, Jian et al. (2011) Incorporating higher-order representative features improves prediction in network-based cancer prognosis analysis. BMC Med Genomics 4:5

Ma, Shuangge; Kosorok, Michael R (2010) Detection of gene pathways with predictive power for breast cancer prognosis. BMC Bioinformatics 11:1

Ma, Shuangge; Zhang, Yawei; Huang, Jian et al. (2010) Identification of non-Hodgkin's lymphoma prognosis signatures using the CTGDR method. Bioinformatics 26:15-21

Ma, Shuangge; Huang, Jian; Shi, Mingyu et al. (2010) Semiparametric prognosis models in genomic studies. Brief Bioinform 11:385-93

Han, Xuesong; Li, Yang; Huang, Jian et al. (2010) Identification of predictive pathways for non-Hodgkin lymphoma prognosis. Cancer Inform 9:281-92

Song, Xiao; Ma, Shuangge (2010) Penalized variable selection with U-estimates. J Nonparametr Stat 22:499-515

Ma, Shuangge; Shi, Mingyu; Li, Yang et al. (2010) Incorporating gene co-expression network in identification of cancer prognosis markers. BMC Bioinformatics 11:271

Ma, Shuangge; Huang, Jian; Moran, Meena S (2009) Identification of genes associated with multiple cancers via integrative analysis. BMC Genomics 10:535
