A fundamental challenge of this time is inference, that is the extraction of information from data. A particular challenge from the algorithmic perspective is the curse of dimensionality. This involves the exponential explosion of computational cost as the number of features of the data grows. The project will advance the understanding of data with many features and provide new tools for data summarization. It has a high potential of leading to the development of new practical algorithms to extract information from data. The main educational outcome of the project will be to train PhD students in theoretical computer science and to provide research experience and mentoring to undergraduate students. The project will enrich the experience of PhD students by funding their attendance to prestigious workshops on the foundations of data science and other aspects of theoretical computer science.

The project will design new tools for efficient inference and solidify understanding of existing tools. The focus will be on tensor methods, deconvolution and convex optimization. The underlying theme is the use of ideas from high-dimensional probability, and discrete and convex geometry. The rationale for these choices is twofold. Firstly, the research team is highly qualified in those fields. Secondly, those foundational fields provide some of the most powerful tools available to understand high-dimensional data. Within the broader topics of the project, this project will focus on (1) Frank-Wolfe methods in optimization, (2) blind and non-blind deconvolution and the interplay with tensor methods and parameter estimation for Gaussian mixture models, and (3) geometric tools to describe and summarize the shape of data. The project will provide insight into well-recognized challenges. While the questions are challenging, in recent years there has been steady progress on them, such as new insights into tensor methods. A specific intellectual opportunity is the identification of the right tools in high-dimensional probability and discrete and convex geometry that provide insight into the design of new algorithms for inference and data analysis. The proposed work on the foundations of data science and optimization has connections with random polytopes, random matrices, high-dimensional probability and discrete and convex geometry. The proposed work is expected to provide more instances of "cross-pollination," where theoretical tools can aid the design of new algorithms and the algorithmic approach provides new questions and motivations to theoretical fields.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start
Project End
Budget Start
2020-08-01
Budget End
2023-07-31
Support Year
Fiscal Year
2020
Total Cost
$450,000
Indirect Cost
Name
University of California Davis
Department
Type
DUNS #
City
Davis
State
CA
Country
United States
Zip Code
95618