A new strategy in cancer prognosis is to base decisions on integrated information from different sources: traditional clinical and demographic information about patients, such as age, grade, and tumor size, and more recently available genetic information, such as the expression of gene or protein markers. Implementing such a strategy requires efficient quantitative models that integrate clinical and genetic measurements for prognosis. The long-range goal of this application is to improve risk prediction, treatment selection, and subtype classification in cancer prevention, diagnosis, and prognosis. The short-term objective is to improve prediction of treatment response for cancer patients by developing innovative statistical models that integrate three types of data: two types of informatics data (protein pathway data and high-throughput protein expression data) and standard clinical and demographic data. We will accomplish the objective of this application by pursuing the following five specific aims: 1) Develop Bayesian parametric models that integrate a known genetic pathway with high-throughput protein expression measurements. 2) Develop Bayesian nonparametric models that integrate multiple genetic pathways with protein expression measurements. 3) Develop Bayesian classification procedures based on the Bayesian models proposed in the previous two aims. 4) Integrate clinical and demographic measurements into the Bayesian models and apply the Bayesian classification procedures to a comprehensive data set that contains protein expression measurements and clinical measurements for more than 500 patients with leukemia. 5) Validate the statistical findings through biological experiments performed by our collaborating biologists. The proposed research is expected to provide oncologists with quantitative prognostic tools based on integrated information. 
The impact of the proposed research will be significant because models developed in this application can be applied to various cancer types and thus potentially improve the prognosis for patients with different types of cancer.

Public Health Relevance

Integrating the protein expression data, the protein pathway data, and the clinical data is expected to significantly improve medical decision making, such as treatment selection. The improved decisions are in turn expected to improve overall patient care. For example, if a certain treatment is accurately predicted to be ineffective for a cancer patient, that patient will not lose time trying it and will have a better chance of finding a more effective therapy.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-HOP-T (02))
Program Officer: Dunn, Michelle C
Northshore University Healthsystem
United States
Zhu, Yitan; Qiu, Peng; Ji, Yuan (2014) TCGA-assembler: open-source software for retrieving and processing TCGA data. Nat Methods 11:599-600
Pan, Haitao; Xie, Fang; Liu, Ping et al. (2014) A phase I/II seamless dose escalation/expansion with adaptive randomization scheme (SEARS). Clin Trials 11:49-59
Lee, Juhee; Muller, Peter; Zhu, Yitan et al. (2013) A Nonparametric Bayesian Model for Local Clustering with Application to Proteomics. J Am Stat Assoc 108:
Hu, Bo; Bekele, B Nebiyou; Ji, Yuan (2013) Adaptive dose insertion in early phase clinical trials. Clin Trials 10:216-24
Trentini, Filippo; Ji, Yuan; Iwamoto, Takayuki et al. (2013) Bayesian mixture models for assessment of gene differential behaviour and prediction of pCR through the integration of copy number and gene expression data. PLoS One 8:e68071
Mitra, Riten; Muller, Peter; Liang, Shoudan et al. (2013) Toward breaking the histone code: Bayesian graphical models for histone modifications. Circ Cardiovasc Genet 6:419-26
Hu, Bo; Ji, Yuan; Xu, Yaomin et al. (2013) Screening for SNPs with Allele-Specific Methylation based on Next-Generation Sequencing Data. Stat Biosci 5:179-197
Xie, Fang; Ji, Yuan; Tremmel, Lothar (2012) A Bayesian adaptive design for multi-dose, randomized, placebo-controlled phase I/II trials. Contemp Clin Trials 33:739-48
Berkova, Zuzana; Wang, Shu; Wise, Jillian F et al. (2009) Mechanism of Fas signaling regulation by human herpesvirus 8 K1 oncoprotein. J Natl Cancer Inst 101:399-411