Deep learning has achieved tremendous success in the past decade. Despite this empirical success, the theoretical understanding of deep learning still lags far behind. A large gap remains between the empirical success of deep learning and conventional optimization and machine learning theory. This project aims to bridge this gap by establishing the theoretical foundations of deep learning to understand why and how it works, and by using this theory to develop new models and algorithms. The expected outcomes of this project include new theories and state-of-the-art approaches for deep learning. The project will push the frontier of deep learning and train next-generation researchers and practitioners in artificial intelligence. Research demonstrations and lab tours will be given to K-12 students, showing the wide range of applications of AI and their connection to society, to motivate them to pursue STEM disciplines.
This project consists of two synergistic research thrusts: (1) understanding the optimization dynamics of training algorithms such as stochastic gradient descent for deep learning models, and deriving algorithm-dependent generalization error bounds to assess their generalization performance; and (2) developing a new suite of faster training algorithms for deep learning, as well as principled neural architecture search algorithms, guided by the generalization error bounds, to design better neural network models. The developed approaches will be evaluated through both theoretical analysis and extensive experiments on real-world benchmarks, including but not limited to image classification and natural language processing. The open-source software and course materials developed in this project will be made publicly available to the broader community to help engineers and scientists better understand and apply deep learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.