Research computing and the linkage, management, and storage of very large databases are now an essential part of cancer research. Analysis of high-dimensional data for cancer research has become highly interdisciplinary, and access to specialized fields (e.g., informatics, statistical programming, computational biology, Geographic Information Systems, system administration, and database management) is of paramount importance. While the price of computation and storage has steadily decreased, the cost of highly specialized research computing expertise and administration of large data sets has increased and will continue to increase. In addition, the large number of requirements for research data security calls for a centralized infrastructure that can provide access to secure data sets and, at the same time, enormous computational power. The overall purpose of the Statistical Computing Core (SCC) is to ensure that all Program investigators have access to linked and securely stored datasets and cutting-edge statistical computing support for the three proposed research projects.
The specific aims are: 1) Provide access to state-of-the-art research computing expertise and infrastructure (Aim 1); 2) Develop a comprehensive data management structure for the Program Project (Aim 2); 3) Provide access to specialized expertise such as GIS, bioinformatics, and statistical programming (Aim 3); and 4) Support and disseminate software to the research community and maintain the Program Project website (Aim 4).

Public Health Relevance

This Program Project aims to analyze large and complex data from observational studies in cancer research, with the ultimate goal of advancing cancer prevention and treatment strategies and reducing the cancer burden in the US. The Statistical and Computing Core provides data and computational support and leadership to help Program Investigators achieve this mission.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZCA1-RPRB-2)
Harvard University
United States
Bobb, Jennifer F; Claus Henn, Birgit; Valeri, Linda et al. (2018) Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression. Environ Health 17:67
Chen, Han; Cade, Brian E; Gleason, Kevin J et al. (2018) Multiethnic Meta-Analysis Identifies RAI1 as a Possible Obstructive Sleep Apnea-related Quantitative Trait Locus in Men. Am J Respir Cell Mol Biol 58:391-401
Pierce, Brandon L; Kraft, Peter; Zhang, Chenan (2018) Mendelian randomization studies of cancer risk: a literature review. Curr Epidemiol Rep 5:184-196
Barfield, Richard; Feng, Helian; Gusev, Alexander et al. (2018) Transcriptome-wide association studies accounting for colocalization using Egger regression. Genet Epidemiol 42:418-433
Liu, Zhonghua; Lin, Xihong (2018) Multiple phenotype association tests using summary statistics in genome-wide association studies. Biometrics 74:165-175
Emilsson, Louise; García-Albéniz, Xabier; Logan, Roger W et al. (2018) Examining Bias in Studies of Statin Treatment and Survival in Patients With Cancer. JAMA Oncol 4:63-70
Sun, Ryan; Carroll, Raymond J; Christiani, David C et al. (2018) Testing for gene-environment interaction under exposure misspecification. Biometrics 74:653-662
Antonelli, Joseph; Cefalu, Matthew; Palmer, Nathan et al. (2018) Doubly robust matching estimators for high dimensional confounding adjustment. Biometrics :
Wilson, Ander; Zigler, Corwin M; Patel, Chirag J et al. (2018) Model-averaged confounder adjustment for estimating multivariate exposure effects with linear regression. Biometrics 74:1034-1044
Makar, Maggie; Antonelli, Joseph; Di, Qian et al. (2017) Estimating the Causal Effect of Low Levels of Fine Particulate Matter on Hospitalization. Epidemiology 28:627-634

Showing the most recent 10 out of 192 publications