Massively parallel computers simulate data about molecular phenomena at previously unimaginable scales, satellites scan the planet capturing vast sets of measurements about ecosystem health, and particle accelerators generate tremendous amounts of data revealing fundamental properties of the smallest building blocks of matter, all with potentially broad societal benefits in areas such as drug discovery, energy conservation, and materials science. Fully realizing these benefits will require a workforce with the technical skills to extract useful information from massive scientific data sets, calling for new approaches to graduate student training that emphasize expertise in data-driven science. This National Science Foundation Research Traineeship (NRT) award to the University of California Irvine (UCI) will tackle this challenge by creating a training ecosystem composed of leading UCI, national-laboratory, and private-sector researchers across particle physics, earth science, chemistry, statistics, and machine learning, all bound together by expertise in the emerging Science of Team Science. The project anticipates training over sixty (60) MS and PhD students, including twenty (20) funded trainees, from diverse backgrounds in computational statistics, machine learning, earth science, particle physics, synthetic chemistry, and team science. After graduation, students from this program will have both the technical and team-science skills to be leaders in the emerging field of data-driven science, and to participate in and lead interdisciplinary research teams at national laboratories, in academia, and in industry labs.

The research agenda of the program seeks to lay a foundation for bridging two approaches: the traditional scientific route of building interpretable models based on physical principles, and data-driven modeling, which can provide high-fidelity predictions but may lack clear interpretability in terms of the underlying science. The program will involve a number of interrelated research themes across multiple disciplines in the information and physical sciences, including machine learning (e.g., temporal and spatial data modeling, multi-scale models, deep learning, and scalable learning algorithms), particle and astroparticle physics (e.g., accelerator-based experiments), earth systems science (e.g., reducing ecosystem response prediction uncertainties), and chemistry (e.g., prediction of physical properties of small molecules). The program also emphasizes team science as a core theme. Students will collaborate in small interdisciplinary research teams consisting of students and faculty with different disciplinary skills, and will take part in team-science workshops leading to student-led development of a team-science certificate in years 3 to 5 of the program. Summer internships for student participants, at both national and industry research laboratories, will reinforce the students' academic training via participation in large-scale interdisciplinary data science research projects.

The NSF Research Traineeship (NRT) Program is designed to encourage the development and implementation of bold, new, potentially transformative models for STEM graduate education training. The Traineeship Track is dedicated to effective training of STEM graduate students in high-priority interdisciplinary research areas, through a comprehensive traineeship model that is innovative, evidence-based, and aligned with changing workforce and research needs.

Agency: National Science Foundation (NSF)
Institute: Division of Graduate Education (DGE)
Type: Standard Grant (Standard)
Application #: 1633631
Program Officer: Vinod Lohani
Project Start:
Project End:
Budget Start: 2016-09-15
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2016
Total Cost: $2,967,150
Indirect Cost:
Name: University of California Irvine
Department:
Type:
DUNS #:
City: Irvine
State: CA
Country: United States
Zip Code: 92697