This is a proposal to develop, refine, and implement statistical methods for comparative effectiveness research in cancer. Using these methods, we will emulate complex randomized trials of personalized and dynamic strategies for the treatment and diagnosis of cancer. We will focus our efforts on three areas. The first is the comparative effectiveness of personalized strategies for the diagnostic work-up of cancer patients. Because questions about the allocation of health care for diagnostic work-up involve the comparison of strategies that are not clearly assigned in the data, conventional statistical methods cannot be used and alternative methods, like artificial censoring plus inverse probability weighting, are required.
In Specific Aim 1, we will develop methods for the comparison of personalized health care delivery strategies in cancer diagnosis. As a motivating example, we will use SEER-Medicare to compare personalized strategies for attendance at multiple medical centers by patients' and providers' characteristics.
In Specific Aim 2, we will develop methods for the comparison of personalized strategies in the presence of unmeasured confounding. Subgroup analyses are important for identifying subpopulations in which treatment is most effective, and thus for personalizing treatment. Because confounding may change the interpretation of subgroup analyses and bias effect modification and interaction parameters, we will develop methodology to assess and correct for the impact of unmeasured confounding in the design, analysis, and interpretation of subgroup analyses in comparative effectiveness research.
In Specific Aim 3, we will develop methods for the comparison of dynamic health care delivery strategies in cancer treatment. Because many questions in cancer involve clinical decisions that depend on the evolving responses of patients or the characteristics of the local health care system, i.e., they are dynamic decisions, conventional statistical methods cannot be used and alternative methods, like dynamic marginal structural models and the parametric g-formula, are required. As a motivating example, we will compare dynamic strategies regarding androgen deprivation therapy for prostate cancer in the CaPSURE cohort.
In Specific Aim 4, we will develop user-friendly, high-quality, open-access software to be distributed to cancer researchers. This project complements the descriptive and inferential aims in Projects 2 and 3, and relies heavily on the Statistical Computing Core and on the organizational infrastructure and team-building strategies provided through the Administrative Core.

Public Health Relevance

This project will provide innovative and practical statistical methods to study the effectiveness of clinical strategies for the diagnosis and treatment of cancer patients using data from large and complex observational studies. These methods will lead to better development and use of observational data for cancer research and to better care of cancer patients, will assist clinicians and patients in their decision-making process, and will inform clinical guidelines.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZCA1-RPRB-2 (M1))
Harvard University
United States