This award supports TRIPODS@Duke Phase I, a project that will develop the foundations of data science both at Duke University and in the broader NC Research Triangle and surrounding region. A total of 25 faculty at Duke representing the disciplines of Computer Science, Electrical Engineering, Mathematics, and Statistical Science will be involved in Phase I. Activities include five semesters of workshops, with 3-4 one-week workshops each semester. These workshops will involve local and national participants and will bring experts on data science to the area. The project will support graduate students and postdoctoral trainees both in their education in the foundations of data science and in their professional development. Educational activities include the development and teaching of data science across curricula in Computer Science, Electrical and Computer Engineering, Mathematics, and Statistical Science, at both the undergraduate and graduate levels. The project will also leverage existing data science programs, including the Rhodes Information Initiative at Duke, a center for "big data" computational research and for expanding opportunities for student engagement in data science; and the Statistical and Applied Mathematical Sciences Institute (SAMSI), one of the NSF/DMS-funded Mathematical Sciences Research Institutes (MSRIs), which is a partnership among Duke University, North Carolina State University (NCSU), and the University of North Carolina at Chapel Hill (UNC).
The topics of the signature workshops supported by the TRIPODS@Duke Phase I project are (1) scalable inference with uncertainty, (2) causal inference, (3) neural networks, (4) complex and dynamic image and signal processing, and (5) interpretable models. These five topics all fall under three research themes that require transdisciplinary collaborations among computer scientists, electrical engineers, mathematicians, and statisticians: Theme I: Scalable algorithms with uncertainty for data science; Theme II: Data science at the human-machine interface; and Theme III: Fundamental limits of data science. The potential research innovations that will be developed or advanced under the three themes include: for Theme I, scalable Bayesian and generalized Bayesian inference, robust optimization for uncertain inputs, and algorithm and architecture design for neural networks; for Theme II, interpretable models and algorithms, causal inference with high-dimensional complex observational data, and image and signal processing for screening and monitoring; and for Theme III, robust optimization for uncertain inputs, statistical and approximation power of deep neural network architectures, and fundamental limits of causal inference in observational studies.
This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.