Individualized prognostic models abound in clinical biomedicine. They predict future outcomes from individual patient characteristics and will play increasingly important roles in the move towards personalized medicine. They can be used in early detection and screening, after a cancer diagnosis to help decide on treatment, or after treatment to monitor for progression and recurrence. While some models are well established, they can likely be improved through the use of additional variables. Larger and better quality training datasets and improved statistical models and methods will improve their accuracy, but the largest potential improvement is through new biomarkers. Since cancer is a heterogeneous disease with multifactorial etiology, many clinical and molecular factors will likely aid in predicting the future for a patient, and would be candidates for inclusion in a new model. The challenge we will address in this research is how to develop a new model that both includes the new biomarkers and makes use of the knowledge implicit in the existing models, when the available datasets containing the new biomarkers are only of modest size. The typical approach to developing a new model from such a dataset is to analyze these data as a separate entity and build a model based on that analysis. However, this approach does not utilize the external information from an established model. Such external information will often be available; however, it may come in the form of regression coefficients, odds ratios or other summary statistics for a subset of the variables, or in the form of a prediction from an online calculator. We will consider a variety of statistical methods for incorporating the external information.
The methods we propose to develop are motivated by specific head and neck cancer and prostate cancer studies, but have much broader applicability to other cancers and other diseases. In the head and neck study, the new biomarkers to be incorporated into the prediction models are HPV status and other molecular biomarkers. For the prostate cancer risk prediction model, the new biomarkers are based on proteins measured from urine. The research is separated into three specific aims. The first aim considers the situation in which there is a modest-sized new dataset that includes a new biomarker, and there is an existing prediction model that does not include this new biomarker. The external information comes in the form of estimates and standard errors of regression parameters from an established prediction model based on a subset of the predictors. We propose a number of different frequentist and Bayesian methods, in which the information on the lower dimensional parameter space is used via inequality constraints and Lagrange multipliers, through prior distributions, and through a novel transformation approach. The properties of the approaches will be compared for continuous and binary response variables. In the second aim, the external information comes in the form of a prediction from one or more calculators; specifically, the predictions for each individual in our own data are used. This aim also considers the situation where there are multiple established prediction models and where the outcome variable is the survival time. We consider three possible methodological approaches: the first is an adaptation of the methods in the first aim; the second, a very general method, incorporates synthetic data generated from the existing models; and the third, also general, uses weights that enable the new biomarker to have a stronger role for observations that were not predicted well by the existing models.
In the third aim we consider the situation where there may be a panel of new biomarkers, and there is also knowledge about the unadjusted association between each new biomarker and the outcome variable, as might be available from a genome-wide association study. A novel nonparametric Bayes approach is proposed to solve this problem.
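
To make the setting of the first aim concrete, the following is a minimal sketch for a continuous outcome. It is not the investigators' method; it swaps in a simple quadratic penalty for the constrained and Bayesian estimators named above. It assumes a working linear model for the new biomarker given the established predictors, under which the reduced-model coefficients equal the full-model coefficients on the established predictors plus the biomarker coefficient times the coefficients of that working model. The function name `fit_with_external`, the penalty weight `lam`, and the simulated values are all hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_with_external(X, b, y, beta_ext, lam):
    """Penalized least squares for y ~ 1 + X + b.

    X        : (n, p) established predictors
    b        : (n,) new biomarker
    y        : (n,) continuous outcome
    beta_ext : (p + 1,) external reduced-model estimates (intercept first)
    lam      : penalty weight; 0 gives ordinary least squares, large values
               force the implied reduced model toward beta_ext
    Returns the (p + 2,) vector: intercept, coefficients on X, coefficient on b.
    """
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])  # design matrix of the reduced model
    # Working linear model b ~ 1 + X (an assumption of this sketch).
    theta, *_ = np.linalg.lstsq(X1, b, rcond=None)
    Z = np.column_stack([X1, b])  # design matrix of the full model
    # Implied reduced-model coefficients are C @ coef, with C = [I | theta].
    C = np.column_stack([np.eye(p + 1), theta])
    # Minimize ||y - Z c||^2 + lam * ||C c - beta_ext||^2 in closed form.
    A = Z.T @ Z + lam * (C.T @ C)
    rhs = Z.T @ y + lam * (C.T @ beta_ext)
    return np.linalg.solve(A, rhs)


# Simulated illustration (all numbers are made up).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
b = 0.5 * X[:, 0] + rng.normal(size=n)
y = 1.0 + X @ np.array([1.0, -1.0]) + 0.8 * b + rng.normal(size=n)
# Reduced-model coefficients implied by the simulation: 1, 1 + 0.8 * 0.5, -1.
beta_ext = np.array([1.0, 1.4, -1.0])

for lam in (0.0, 10.0, 1e4):
    print(lam, fit_with_external(X, b, y, beta_ext, lam))
```

In this sketch `lam` plays a role loosely analogous to the precision of the external estimates: a small value trusts the new dataset, and a large value approaches a hard equality constraint. A fuller treatment would use the reported standard errors of the external estimates to set that weight.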

Public Health Relevance

Individualized prognostic models and risk calculators have increasing importance in personalized medicine. There are many such established models that use standard variables as inputs. An open question is how new and important biomarkers can be included as additional inputs in a new prognostic model, derived from a possibly small or modest-sized dataset, while also using the established model. When there are many prognostic models for a particular situation, a further open question is how all the predictions from the existing models can be combined to lead to a single improved model that includes the new biomarkers. We will perform statistical research to address these open questions. The research will be evaluated for prognosis in head and neck cancer and early detection of prostate cancer, and will be applicable to other cancers and other diseases.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Cancer Biomarkers Study Section (CBSS)
Program Officer
Feuer, Eric J
University of Michigan Ann Arbor
Biostatistics & Other Math Sci
Schools of Public Health
Ann Arbor
United States
Shen, Jincheng; Wang, Lu; Daignault, Stephanie et al. (2018) Estimating the Optimal Personalized Treatment Strategy Based on Selected Variables to Prolong Survival via Random Survival Forest with Weighted Bootstrap. J Biopharm Stat 28:362-381
Schmitd, Ligia B; Beesley, Lauren J; Russo, Nickole et al. (2018) Redefining Perineural Invasion: Integration of Biology With Clinical Outcome. Neoplasia 20:657-667
Cho, Youngjoo; Hu, Chen; Ghosh, Debashis (2018) Covariate adjustment using propensity scores for dependent censoring problems in the accelerated failure time model. Stat Med 37:390-404
Manohar, Poorni M; Beesley, Lauren J; Bellile, Emily L et al. (2018) Prognostic Value of FDG-PET/CT Metabolic Parameters in Metastatic Radioiodine-Refractory Differentiated Thyroid Cancer. Clin Nucl Med 43:641-647
Cheng, Wenting; Taylor, Jeremy M G; Vokonas, Pantel S et al. (2018) Improving estimation and prediction in linear regression incorporating external information from an established reduced model. Stat Med 37:1515-1530
Boonstra, Philip S; Barbaro, Ryan P (2018) Incorporating historical models with adaptive Bayesian updates. Biostatistics :
Shen, Jincheng; Wang, Lu; Taylor, Jeremy M G (2017) Estimation of the optimal regime in treatment of prostate cancer recurrence from observational data using flexible weighting models. Biometrics 73:635-645
Suresh, Krithika; Taylor, Jeremy M G; Spratt, Daniel E et al. (2017) Comparison of joint modeling and landmarking for dynamic prediction under an illness-death model. Biom J 59:1277-1300
Conlon, A S C; Taylor, J M G; Elliott, M R (2017) Surrogacy assessment using principal stratification and a Gaussian copula model. Stat Methods Med Res 26:88-107
Conlon, Anna; Taylor, Jeremy; Li, Yun et al. (2017) Links between causal effects and causal association for surrogacy evaluation in a Gaussian setting. Stat Med 36:4243-4265

Showing the most recent 10 out of 56 publications