The advancement of deep learning, the technique of training artificial neural networks to make predictions, has led to recent breakthroughs in many areas of artificial intelligence, such as computer vision, natural language understanding, and robotics. A major challenge in deep learning is ensuring accurate predictions in unseen scenarios. This project plans to tackle this challenge through theoretical analysis and empirical evaluation of that analysis. The project aims to contribute to the fundamental understanding of deep learning and to inform its practical advancement, improving its reliability, efficiency, and risk management in data-hungry and risk-sensitive applications. An education plan is integrated into this project: the investigator will develop new courses, mentor students, organize workshops, and work with high-school teachers on developing high-school AI courses.

The project aims to build a comprehensive generalization theory for deep neural networks, covering the technical question of the implicit regularization effect as well as the broader concepts of out-of-domain generalization and the estimation of generalization errors. The project has three major thrusts. The first thrust is to characterize the optimizers’ implicit regularization effect for complex models. Leveraging the resulting theoretical insights, the investigator will make implicit regularization more explicit, stronger, and customizable to datasets in order to improve generalization. The second thrust is to theoretically study out-of-domain generalization in settings with increasing differences between the training and test environments, addressed through increasing exploitation of unlabeled data and their properties. The third thrust is to study the estimation of generalization errors, which is crucial for quantifying risk before deploying machine learning models in risk-sensitive applications such as healthcare.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 2045685
Program Officer: Rebecca Hwa
Project Start:
Project End:
Budget Start: 2021-03-01
Budget End: 2026-02-28
Support Year:
Fiscal Year: 2020
Total Cost: $105,351
Indirect Cost:

Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305