Although comparative effectiveness research (CER) in oncology has attracted substantial attention as a means of providing timely treatment comparisons and improving health outcomes, considerable methodological gaps remain in combining multiple data sources with efficient statistical methods to assemble evidence in CER. The proposed study is directly motivated by our collaborations with breast cancer medical oncologists and surgeons investigating inflammatory breast cancer (IBC), a rare but aggressive form of breast cancer. The primary objective of this proposal is to develop statistical methods and risk prediction models that combine cohort data containing detailed tumor biology variables with aggregate information, with or without sampling error, from population-based registry databases. In Aim 1, we propose statistical methods that use aggregate information from external data when analyzing primary cohort data with individual patient-level data under both parametric and semiparametric models for survival data, together with a test procedure to evaluate the comparability of the information from the primary cohort data with that from the external data. In Aim 2, we will generalize these approaches to account for uncertainty in the aggregate information in the estimation and inference procedures for survival data. In Aim 3, we will link the primary cohort data, which include detailed risk profiles, with external data lacking detailed risk factors to develop a novel comprehensive IBC-specific mortality risk prediction model, and provide an estimation approach to evaluate the performance of the resulting risk prediction model. From an application perspective, our proposed approach of maximizing the use of existing IBC cohort data by combining them with external registry databases is cost-effective and may directly improve evidence-based treatment guidelines for IBC patients.
Although motivated by IBC research, the statistical methods will be useful for addressing the challenges of CER in other chronic diseases, especially rare ones. All software for the analytical and statistical tools developed in this project, once validated, will be made available to the broader research community.
Population-based cancer registry datasets are valuable resources for filling the information gaps created by a lack of evidence from clinical trials and from observational studies with small to moderate sample sizes. We propose new statistical tools for estimation, inference, and risk prediction that efficiently combine data from a primary cohort, which includes detailed risk factors at the individual patient level, with external population-based cancer registry data. The methods will be applied to inflammatory breast cancer (IBC) studies and may help inform optimal IBC patient care.