The increasing volume of big data in cancer research has the potential to dramatically accelerate the translation of knowledge from bench to bedside. Unfortunately, most cancer researchers are unable to: (i) utilize the valuable big data that is readily available in the public domain, and (ii) extract knowledge from cancer big data by communicating with computer scientists, statisticians and bioinformaticians. Traditionally, cancer researchers are trained in the biological sciences relevant to the manifestation of the disease. This knowledge remains critical for understanding the biological and molecular mechanisms that cause the disease and that can be targeted for clinical intervention. Historically, however, cancer researchers have not been trained to handle large volumes of data. There was no need, because few approaches generated large-scale data. With the advent of high-throughput approaches, in particular those related to genomics, proteomics and metabolomics, a significant gap in the training of cancer researchers has become apparent: the need for skills in computer science and statistics to analyze big data and interpret the results of those analyses. Without quantitative training for cancer researchers, a bottleneck will remain in the translation of the large body of cancer big data to clinical practice. This need was confirmed by a needs assessment sent last year to researchers at 95 Cancer Centers (including all 69 NCI-Designated Cancer Centers). To address the need for a big data training course, the investigators propose to build on a previously NIH-funded big data training course to develop and deliver a new training course tailored to cancer researchers across the country.
In a partnership between the Purdue University Center for Cancer Research (PCCR), the Indiana University Simon Cancer Center (IUSCC), and a group of traditionally trained biostatisticians, the team is in a unique position to leverage a basic and a clinical cancer center (the only two NCI-Designated Cancer Centers in the State) to work together on this multi-disciplinary training program. In contrast to the previous successful big data training course, which was designed for general biomedical researchers who were novices in big data science, this new course will target cancer researchers who recognize the value of big data but lack the quantitative skills necessary to work with it. Based on case studies from both PCCR and IUSCC researchers, the goal of the course is to help participants develop skills for managing, visualizing, analyzing, and integrating various types of publicly available cancer big data. This is increasingly important as more and more precision oncology-focused treatments come online. With this customized big data training, cancer researchers can realize the transformative potential of big data by translating it from bench to bedside.

Public Health Relevance

Cancer big data is a collection of high-density information from a variety of sources, and it requires sophisticated statistical and computational tools to analyze it and extract knowledge from it. Building on a previous big data training program designed for general biomedical researchers, this new training program aims to help cancer researchers develop the analytical skills necessary to work with big data and to translate big data into knowledge that facilitates precision cancer medicine.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Education Projects (R25)
Study Section
Subcommittee I - Transition to Independence (NCI)
Program Officer
Korczak, Jeannette F
Purdue University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
West Lafayette
United States