This application is submitted in response to PAR-04-159, "Small Grants Program for Cancer Epidemiology," under the topic "Analyzing existing data that otherwise may have gone unexplored, such as pooled analysis of data from multiple studies coordinated into consortia." A quick review of the current literature finds only tens of papers on cancer studies that integrate SNP and gene expression data. The lack of algorithms and user-friendly software for mining existing data may be partly to blame. The proposed study will provide tools for mining genetic data from different sources. This pilot project focuses on developing efficient algorithms for clustering, molecular network construction, and biomarker discovery with integrated SNP and gene expression data. With the efficient data mining methods we develop, cancer researchers may extract much more useful information from otherwise unexplored data. The work therefore has broad implications for analysis in all high-priority areas of cancer epidemiology research identified by the Progress Review Groups, such as multiple myeloma and cancers of the breast, colon/rectum, prostate, lung, pancreas, and brain, and for linking genetic polymorphisms with other variables related to cancer risk. Upon completion of the proposed research, the methods and algorithms developed can potentially be applied to other mixed data sources, such as methylation and gene expression data. We hope our research will encourage more people to contribute to this challenging problem.
Liu, Zhenqiu; Lin, Shili; Tan, Ming T (2010) Sparse support vector machines with Lp penalty for biomarker identification. IEEE/ACM Trans Comput Biol Bioinform 7:100-7
Liu, Zhenqiu; Gartenhaus, Ronald B; Chen, Xue-Wen et al. (2009) Survival prediction and gene identification with penalized global AUC maximization. J Comput Biol 16:1661-70
Liu, Zhenqiu; Gartenhaus, Ronald B; Tan, Ming et al. (2008) Gene and pathway identification with Lp penalized Bayesian logistic regression. BMC Bioinformatics 9:412
Liu, Zhenqiu; Tan, Ming (2008) ROC-based utility function maximization for feature selection and classification with applications to high-dimensional protease data. Biometrics 64:1155-61