The project at UC Davis will establish the UC Davis TETRAPODS Institute of Data Science (UCD4IDS), composed of thirty-five researchers (four PIs and thirty-one senior personnel) from four departments: Computer Science, Electrical & Computer Engineering, Mathematics, and Statistics. The institute will break interdepartmental barriers and promote interdisciplinary research collaborations among faculty members, postdocs, and graduate students. The project will encourage innovative and robust research, and provide education and mentoring in data science for graduate students and postdocs. Students and postdocs engaged in this project will be trained to be the next generation of interdisciplinary data scientists: they will gain deep knowledge of some focused areas and, at the same time, broaden their perspectives in other diverse fields. The UCD4IDS will draw on the insights that faculty members have gained from their experience in the four primary departments as well as in application fields such as neuroscience, medical and health sciences, and veterinary medicine. The UCD4IDS will organize: a) round-table discussions and breakout sessions after weekly seminars related to data science; b) quarterly colloquia on data science; and c) annual three-day workshops. The project will also coordinate and develop diverse courses at UC Davis, with graduate students involved in the project taking at least one course in each of the four departments. The PI team will also leverage local programs to recruit, support, and retain graduate students, postdocs, and new faculty members from underrepresented groups by matching them to appropriate mentors.
To disseminate the research and educational results, the PI team plans to: 1) make colloquium and workshop talk slides, lecture notes, and code available online, so that they reach current and future collaborators and the general public; and 2) organize mini-symposia and workshops on foundations of data science at targeted conferences.

Research at the UCD4IDS will focus on three broad themes: 1) Fundamentals of machine learning directed toward biological and medical applications; 2) Optimization theory and algorithms for machine learning including numerical solvers for large-scale nontrivial learning problems; and 3) High-dimensional data analysis on graphs and networks. The algorithms and software tools to be developed will make a positive impact in solving practical data-analysis and machine-learning problems in diverse fields, e.g., computer science (analyzing friendship relations in social networks); electrical engineering (monitoring and controlling sensor networks); civil engineering (monitoring traffic flow on a road network); and in particular, biology and medicine (analyzing data measured on real neural networks, detecting changes in the brain structures due to diseases, imaging live biological cells for analyzing their growth, etc.). The technical goals of this project are: 1) geometric understanding of high-dimensional data, which may allow efficient (re)sampling from manifolds representing certain phenomena of interest and classifying subtle yet critical differences that often appear in biological and medical applications; 2) providing theoretical guarantees and efficient numerical algorithms for non-convex optimization, which is crucial to machine learning; and 3) deepening understanding of how local interactions between individual entities (e.g., neurons) lead to global coordination and decision making.

This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1934568
Program Officer: Zhengdao Wang
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2022-09-30
Support Year:
Fiscal Year: 2019
Total Cost: $1,000,000
Indirect Cost:
Name: University of California Davis
Department:
Type:
DUNS #:
City: Davis
State: CA
Country: United States
Zip Code: 95618