Methods for large-scale machine learning and artificial intelligence (AI) have had major impacts on the world over the past decade, in both industrial and scientific contexts. These spectacular successes are driven by a combination of massive datasets and models and algorithms capable of extracting useful information and insights from those datasets. This research project aims to advance the methodology and understanding of algorithms for large-scale machine learning and AI by exploiting the interplay between sampling and optimization. In particular, it addresses two grand challenges: first, using the tools and insights of optimization theory to develop more effective design and analysis techniques for sampling methods; second, using those techniques to design and analyze optimization methods for problems such as those that arise in deep learning. Successful research outcomes of this project are likely to increase the understanding of methods used for sampling and for optimization and to facilitate their principled design. Successful outcomes also have significant potential for practical impact in the large and growing set of applications where large-scale sampling and optimization methods are used, including computer vision, speech recognition, and self-driving cars. The research will support the development of graduate students, will be disseminated through large graduate courses at Berkeley and their web-based course materials, and has the potential to benefit the broader community through the application of the methods studied in deployed AI systems.

The project has three main technical directions. First, it aims to identify the inherent difficulty of sampling problems by proving lower bounds. Second, it aims to produce analysis tools and design methodologies for sampling algorithms based on a family of stochastic differential equations known as Langevin diffusions; this will enable the development of sampling algorithms with performance guarantees. Third, it will use the viewpoint of sampling techniques to analyze and design stochastic gradient methods for nonconvex optimization problems, such as the optimization of parameters in deep neural networks. An additional outcome of the project will be the organization of a workshop on the interface between sampling and optimization.
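The abstract does not specify particular algorithms, but as a concrete illustration of the Langevin-diffusion-based sampling methods referenced above, the following is a minimal sketch of the unadjusted Langevin algorithm (a standard Euler-Maruyama discretization of the Langevin diffusion), not the project's own method. The function name, parameters, and the Gaussian example are illustrative assumptions.

```python
import numpy as np

def unadjusted_langevin(grad_log_density, x0, step_size=0.01, n_steps=1000, rng=None):
    """Illustrative sketch: Euler-Maruyama discretization of the Langevin diffusion
    dX_t = grad log pi(X_t) dt + sqrt(2) dB_t.

    For a small step size, the iterates approximately sample from the target
    density pi (up to discretization bias); quantifying such guarantees is the
    kind of question the project studies, not what this sketch resolves.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    chain = [x.copy()]
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step on log pi plus Gaussian noise scaled by sqrt(2 * step size).
        x = x + step_size * grad_log_density(x) + np.sqrt(2.0 * step_size) * noise
        chain.append(x.copy())
    return np.array(chain)

# Example (assumed for illustration): sample from a standard Gaussian,
# whose log-density gradient is grad log pi(x) = -x.
samples = unadjusted_langevin(lambda x: -x, x0=np.zeros(2), step_size=0.05, n_steps=5000)
```

The same update rule, with the full gradient replaced by a stochastic minibatch gradient, underlies stochastic-gradient Langevin methods, which is one way the sampling and optimization viewpoints described above connect.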

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1909365
Program Officer: Rebecca Hwa
Project Start:
Project End:
Budget Start: 2019-08-01
Budget End: 2022-07-31
Support Year:
Fiscal Year: 2019
Total Cost: $450,000
Indirect Cost:
Name: University of California Berkeley
Department:
Type:
DUNS #:
City: Berkeley
State: CA
Country: United States
Zip Code: 94710