Unprecedented advances in digital technology during the second half of the 20th century have produced a revolution that is transforming science, including health and biomedical research, by providing data of unprecedented complexity in volumes and at a rate that was previously unimaginable. Members of the National Research Council's (NRC's) Committee on Massive Data Analysis concluded in their 2013 Frontiers of Massive Data Analysis report that the challenges associated with Big Data go far beyond the technical aspects of data management and emphasized that development of rigorous quantitative and statistical methods is crucial if we are to use these data to their advantage. In this application we describe an integrated program designed to train students in the quantitative and computational skills, the communication and interdisciplinary research skills, and the application of those skills that they will need to become the next generation of leading Big Data scientists in health and biomedical research. At the Harvard T.H. Chan School of Public Health, we have made a substantial investment in addressing these challenges, including launching a new formal Master's Degree program in Computational Biology and Quantitative Genomics, revamping the curriculum in Biostatistics to include a greater emphasis on computational methods and Big Data, putting forward a proposal, now undergoing internal review, to include computation as an area of core competency for our students, and making Big Data analytics a central focus of the School's ongoing capital campaign. We are requesting support for six pre-doctoral students who will emerge from the program with expertise in cutting-edge statistical and computational methods development, a thorough understanding of fundamental basic science, public health, and clinical science, and demonstrated skills in the application of those methods in a wide range of areas in health and biomedical research.
Our students will participate in a program designed to provide them with interdisciplinary research experience, to train them to collaborate and communicate effectively, and to help them understand the importance of data provenance and reproducible research. The training program involves active participation by accomplished and experienced multidisciplinary faculty members, including biostatisticians, bioinformatics scientists and computational biologists, computer scientists, molecular biologists, public health researchers, and clinicians. It combines coursework with lab rotations in biostatistics, computational biology, computer science, molecular biology, population science, and clinical science. Students will participate in directed and independent methodological research, will be involved in broad-based collaborative research projects, and will have rich career development opportunities in a stimulating and nurturing interdisciplinary environment that will prepare them to be leaders in quantitative Big Data health science research.

Public Health Relevance

Unprecedented advances in digital technology during the second half of the 20th century have produced a revolution that is transforming science, including health and biomedical research, by providing data of unprecedented complexity in volumes and at a rate that was previously unimaginable. Members of the National Research Council's (NRC's) Committee on Massive Data Analysis concluded in their 2013 'Frontiers of Massive Data Analysis' report that the challenges associated with 'Big Data' go far beyond the technical aspects of data management and emphasized that development of rigorous quantitative and statistical methods is crucial if we are to use these data to their advantage. In this application we describe an integrated program designed to train students in the quantitative and computational skills, and the application of those skills, that they will need to become the next generation of leading Big Data scientists in health and biomedical research.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Institutional National Research Service Award (T32)
Project #
5T32LM012411-02
Application #
9248431
Study Section
Special Emphasis Panel (ZRG1-IMST-T (50)R)
Program Officer
Ye, Jane
Project Start
2016-04-01
Project End
2021-03-31
Budget Start
2017-04-01
Budget End
2018-03-31
Support Year
2
Fiscal Year
2017
Total Cost
$280,200
Indirect Cost
$13,644
Name
Harvard University
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
149617367
City
Boston
State
MA
Country
United States
Zip Code
02115
Hecker, Julian; Xu, Xin; Townes, F William et al. (2018) Family-based tests for associating haplotypes with general phenotype data: Improving the FBAT-haplotype algorithm. Genet Epidemiol 42:123-126
Valeri, Linda; Patterson-Lomba, Oscar; Gurmu, Yared et al. (2016) Predicting Subnational Ebola Virus Disease Epidemic Dynamics from Sociodemographic Indicators. PLoS One 11:e0163544