Northeastern University requests funds for a Summer School entitled Big Data and Statistics for Bench Scientists. The target audience for the School is graduate and postgraduate life scientists who work primarily in wet labs and who generate large datasets. Unlike other educational efforts that emphasize genomic applications, this School targets scientists working with other experimental technologies. Mass spectrometry-based proteomics and metabolomics are our main focus; however, the School is also appropriate for scientists working with other assays, e.g. nuclear magnetic resonance spectroscopy (NMR), protein arrays, etc. This large community has traditionally been under-served by educational efforts in computation and statistics. This proposal aims to fill this void. The Summer School is motivated by feedback from smaller short courses previously co-organized or co-instructed by the PI, and will cover theoretical and practical aspects of the design and analysis of large-scale experimental datasets. The Summer School will have a modular format, with eight 20-hour modules scheduled in two parallel tracks over two consecutive weeks. Each module can be taken independently. The planned modules are (1) Processing raw mass spectrometric data from proteomic experiments using Skyline, (2) Beginner's R, (3) Processing raw mass spectrometric data from metabolomic experiments using OpenMS, (4) Intermediate R, (5) Beginner's guide to statistical experimental design and group comparison, (6) Specialized statistical methods for detecting differentially abundant proteins and metabolites, (7) Statistical methods for discovery of biomarkers of disease, and (8) Introduction to systems biology and data integration. Each module will introduce the necessary statistical and computational methodology, and contain extensive practical hands-on sessions.
Each module will be organized by instructors with extensive interdisciplinary teaching experience, and supported by several teaching assistants. We anticipate the participation of 104 scientists, each taking on average 2 modules. Funding is requested for three yearly offerings of the School, and includes funds to provide US participants with 62 travel fellowships per year, and 156 registration fee waivers per module. All the course materials, including videos of the lectures and of the practical sessions, will be publicly available free of charge.

Public Health Relevance

Northeastern University proposes to organize a Summer School, "Big Data and Statistics for Bench Scientists". The Summer School will train life scientists and computational scientists in designing and analyzing large-scale experiments relying on proteomics, metabolomics, and other high-throughput biomolecular assays. The training will enhance the effectiveness and reproducibility of biomedical research, such as the discovery of diagnostic biomarkers for early diagnosis of disease, or of prognostic biomarkers for predicting therapy response.

National Institutes of Health (NIH)
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Education Projects (R25)
Study Section
Special Emphasis Panel (ZRG1-BST-U (55)R)
Program Officer
Baird, Richard A
Northeastern University
Schools of Arts and Sciences
United States