and Relevance The primary goal of this project is to develop a novel, integrated approach for the analysis of high-throughput cancer genomic data. We plan to develop new variable selection methods for 1) class discovery, in which we determine subgroups of a specified cancer to better understand the underlying cancer biology, and 2) predictive gene signatures, in which we determine a subset of genes that is predictive of patients' clinical phenotypes, including survival and response to therapy. First, we will develop a new method for variable selection in clustering. Clustering plays a critical role in the analysis of genomic cancer data. For example, based on gene expression profiles, important cluster distinctions can be found among a set of tissue samples, which may reflect categories of disease, mutation status, or different responses to a given therapy. Second, we will develop a new penalized-likelihood method for variable selection in regression that uses group information to select groups of correlated genes that share the same biological pathway. The developed methodology will be useful for identifying important gene signatures that may lead to more effective personalized treatment in any health study where survival time or response to therapy is of interest.
Our project aims to develop a new class of variable selection methods for analyzing high-throughput cancer genomic data. Compared with existing methods, the proposed methods will provide more powerful class discovery for identifying cancer subtypes and more accurate prediction of patients' survival and response to therapy. The developed methodology will be useful for identifying important gene signatures that may lead to more effective personalized treatment in any health study where survival time or response to therapy is of interest.
|Lin, Huazhen; Fei, Zhe; Li, Yi (2016) A semiparametrically efficient estimator of the time-varying effects for survival data with time-dependent treatment. Scand Stat Theory Appl 43:649-663|
|Li, Yi; Dicker, Lee; Zhao, Sihai Dave (2014) The Dantzig Selector for Censored Linear Regression Models. Stat Sin 24:251-2568|
|Eng, Kevin H; Hanlon, Bret M (2014) Discrete mixture modeling to address genetic heterogeneity in time-to-event regression. Bioinformatics 30:1690-7|
|Xu, Peirong; Zhu, Lixing; Li, Yi (2014) Ultrahigh dimensional time course feature selection. Biometrics 70:356-65|
|Zhou, Ling; Lin, Huazhen; Song, Xinyuan et al. (2014) Selection of latent variables for multiple mixed-outcome models. Scand Stat Theory Appl 41:1064-1082|