One of the key tenets taught in courses on Statistics and Machine Learning is that data interpolation (or data memorization) inevitably leads to overfitting and poor prediction performance. Yet most modern large-scale models, including over-parametrized neural networks, are routinely optimized to achieve zero error on the training data. The research objective of this project is to challenge this common wisdom and develop theoretical and algorithmic foundations for methods that interpolate the training data.

The project will focus on the statistical and computational aspects of interpolation methods. On the statistical side, consistency and finite-sample bounds will be derived for regression and classification methods in the interpolation regime, and information-theoretic limits of interpolating rules will be established. On the computational side, the PI aims to shed light on the relative advantages and disadvantages of over-parametrized models that have the capacity to fit the training data perfectly.
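As a concrete illustration of the interpolation regime the abstract describes, the following minimal sketch (not part of the award, and not the PI's method) fits a minimum-norm linear model with more features than samples. The data-generating setup, dimensions, and noise level are assumptions chosen purely for exposition; the point is that the estimator achieves essentially zero training error yet still attains finite test error.

```python
# Minimal sketch: minimum-l2-norm interpolation in an over-parametrized
# linear model (d >> n). All numbers below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n, d = 50, 500                      # n samples, d features: exact interpolation is possible
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:10] = 1.0                   # sparse ground-truth signal (toy assumption)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# For an under-determined system, lstsq returns the minimum-norm interpolant.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
train_mse = np.mean((X @ w_hat - y) ** 2)

# Fresh test data from the same distribution.
X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_true + 0.1 * rng.standard_normal(1000)
test_mse = np.mean((X_test @ w_hat - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}")  # ~0: the model interpolates the training data
print(f"test  MSE: {test_mse:.2f}")   # finite: interpolation need not ruin prediction
```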

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1953181
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2020-06-01
Budget End: 2023-05-31
Support Year:
Fiscal Year: 2019
Total Cost: $66,667
Indirect Cost:
Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139