The proposed research seeks to develop new statistical methods for assessing the performance of prediction models for cancer risk and prognosis when the endpoint of interest, such as patient survival or time to cancer recurrence, is subject to potentially dependent censoring, which is often present in observational and epidemiological studies. The significance of prediction models for cancer risk and prognosis is well established: they can be used to identify individuals at high risk; to plan interventional trials and subsequently design and improve personalized prevention and treatment strategies; and to estimate the population burden and cost of cancer and the impact of potential interventions and treatments. To identify optimal (or better) prediction models, it is crucial to develop robust predictive accuracy metrics for assessing and comparing them. In the presence of dependent censoring, predictive accuracy metrics that do not adjust for the censoring mechanism are likely to give a biased assessment of prediction models. While a considerable amount of work has been reported on the development of predictive accuracy metrics, there has been only limited work on such metrics for censored data, and most of it has been developed for the case of independent censoring and is limited to Cox proportional hazards models. In addition, owing to major advances in technology, high-dimensional biomarkers such as genomic and proteomic data are increasingly collected in cancer research studies, and modern statistical methods have been developed to use these high-dimensional data when constructing prediction models; this presents another challenge for assessing predictive accuracy in the presence of dependent censoring.
These considerations lead to the following specific aims: 1) develop new metrics that account for the censoring mechanism when assessing the predictive accuracy of regression models for cancer endpoints that are subject to dependent censoring; 2) develop new metrics that account for the censoring mechanism when assessing the predictive accuracy of accelerated failure time models for cancer endpoints that are subject to dependent censoring; 3) develop sensitivity analyses for the case where censoring may depend on unobserved survival times; and 4) perform a systematic evaluation of predictive accuracy metrics for censored data through extensive simulations and real data analysis. The proposed statistical methods, once developed, will allow the predictive accuracy of prediction models to be assessed under a wide range of settings, including different censoring mechanisms and high-dimensional data. The proposed numerical studies will shed light on the applicability, advantages, and disadvantages of different metrics, as well as the impact of the censoring mechanism on these metrics, and will subsequently provide better guidance to cancer researchers on how to use and interpret these metrics in research studies and in practice.
The objective is to develop statistical methods for assessing prediction models for cancer endpoints that are subject to dependent censoring in observational and epidemiological studies. The proposed statistical methods will allow predictive accuracy to be assessed under a wide range of settings, including different censoring mechanisms and high-dimensional data. The proposed numerical studies will shed light on the applicability, advantages, and disadvantages of different metrics, as well as the impact of the censoring mechanism on these metrics, and will subsequently provide better guidance to cancer researchers on how to use and interpret these metrics in research studies and in practice.
|Pellegrini, Kathryn L; Sanda, Martin G; Patil, Dattatraya et al. (2017) Evaluation of a 24-gene signature for prognosis of metastatic events and prostate cancer-specific mortality. BJU Int 119:961-967|
|Hu, Yi-Juan; Schmidt, Amand F; Dudbridge, Frank et al. (2017) Impact of Selection Bias on Estimation of Subsequent Event Risk. Circ Cardiovasc Genet 10:|
|Safo, Sandra E; Li, Shuzhao; Long, Qi (2017) Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information. Biometrics :|
|Zhao, Yize; Chung, Matthias; Johnson, Brent A et al. (2016) Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence. J Am Stat Assoc 111:1427-1439|
|Wang, Ming; Long, Qi (2016) Addressing issues associated with evaluating prediction models for survival endpoints based on the concordance statistic. Biometrics 72:897-906|
|Torres, Mylin A; Yang, Xiaofeng; Noreen, Samantha et al. (2016) The Impact of Axillary Lymph Node Surgery on Breast Skin Thickening During and After Radiation Therapy for Breast Cancer. Int J Radiat Oncol Biol Phys 95:590-6|
|Long, Qi; Johnson, Brent A (2015) Variable selection in the presence of missing data: resampling and imputation. Biostatistics 16:596-610|
|Tu, Huakang; Sun, Liping; Dong, Xiao et al. (2015) Temporal changes in serum biomarkers and risk for progression of gastric precancerous lesions: a longitudinal study. Int J Cancer 136:425-34|
|Long, Qi; Xu, Jianpeng; Osunkoya, Adeboye O et al. (2014) Global transcriptome analysis of formalin-fixed prostate cancer specimens identifies biomarkers of disease recurrence. Cancer Res 74:3228-37|
|Hsu, Chiu-Hsieh; Long, Qi; Li, Yisheng et al. (2014) A nonparametric multiple imputation approach for data with missing covariate values with application to colorectal adenoma data. J Biopharm Stat 24:634-48|