Research computing and the linkage, management, and storage of very large databases are now an essential part of cancer research. Analysis of high-dimensional data for cancer research has become highly interdisciplinary, and access to specialized fields (e.g., informatics, statistical programming, computational biology, Geographic Information Systems, system administration, and database management) is of paramount importance. While the price of computation and storage has steadily decreased, the cost of highly specialized research computing expertise and of administering large datasets has increased and will continue to increase. In addition, the large number of requirements for research data security calls for a centralized infrastructure that can provide access to secure datasets and, at the same time, enormous computational power. The overall purpose of the Statistical Computing Core (SCC) is to ensure that all Program investigators have access to linked and securely stored datasets and cutting-edge statistical computing support for the three proposed research projects.
The specific aims are: 1) Provide access to state-of-the-art research computing expertise and infrastructure (Aim 1); 2) Develop a comprehensive data management structure for the Program Project (Aim 2); 3) Provide access to specialized expertise such as GIS, bioinformatics, and statistical programming (Aim 3); and 4) Support and disseminate software to the research community and maintain the Program Project website (Aim 4).

Public Health Relevance

This Program Project aims to analyze large and complex data from observational studies in cancer research, with the ultimate goal of advancing cancer prevention and treatment strategies and reducing the cancer burden in the US. The Statistical and Computing Core provides data and computational support and leadership to help Program investigators achieve this mission.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZCA1-RPRB-2)
Harvard University
United States