In the past decade, deep learning has made astonishing breakthroughs in real-world applications, including computer vision, natural language processing, speech recognition, healthcare, and robotics. Deep learning uses multiple layers of linear transformations followed by nonlinear activations to represent abstractions in the data. It is commonly believed that deep neural networks are good at learning the geometric structures hidden in data sets, such as rich local regularities, global symmetries, and repetitive patterns. However, little theory has been established to explain the power of deep neural networks in analyzing complex data sets containing geometric structures. This project will develop theoretical and computational foundations for understanding how deep neural networks exploit geometric structures in data sets and achieve outstanding performance. The results will provide new insights into the development of new deep learning models and methodologies.

This project focuses on three sets of related but distinct problems. The first set concerns efficient approximation of functions on low-dimensional manifolds using deep neural networks. Existing theory shows that deep learning estimators converge to the true function extremely slowly in high dimensions. When the function is supported on a low-dimensional manifold, the PIs plan to prove a fast convergence rate that depends on the intrinsic dimension of the manifold, as illustrated below. This project will make contributions to function approximation theory, error analysis in statistical regression and classification, and the adaptive theory of deep learning. The second set of problems concerns estimation of probability distributions supported on a low-dimensional manifold by deep generative models. These models use two neural networks to minimize the Integral Probability Metric (IPM) between the estimator and the data distribution over the class of distributions generated by a deep generator network, with the function class in the IPM realized by a deep discriminator network; the objective is written out below. This project will design appropriate architectures for the generator and discriminator networks and prove performance guarantees for deep generative models. The third set of problems focuses on efficiently computing the optimal transport between two probability distributions using deep neural networks. After reformulating the optimal transport problem as a min-max optimization problem parameterized by two neural networks, the PIs propose a primal-dual stochastic gradient descent algorithm to solve it; a sketch of such a scheme follows.
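To make the first set concrete: in classical nonparametric regression of an s-smooth function, the minimax squared error decays at a rate governed by the ambient dimension D, which is extremely slow when D is large; the kind of guarantee targeted here replaces D by the intrinsic dimension d of the manifold. The display below is a hedged illustration of this contrast drawn from standard minimax theory, not a result stated in this abstract; the exponents the PIs prove may differ.

$$
\underbrace{n^{-\frac{2s}{2s+D}}}_{\text{ambient-dimension rate}}
\;\longrightarrow\;
\underbrace{n^{-\frac{2s}{2s+d}}}_{\text{intrinsic-dimension rate}},
\qquad d \ll D,
$$

where n is the sample size and s the smoothness of the target function.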
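For the second set, the IPM has a standard definition. Writing $\mathcal{F}$ for the function class realized by the discriminator network, $\mathcal{G}$ for the class of distributions realized by the generator network, and $\widehat{\nu}_n$ for the empirical data distribution, the estimator described above solves (standard formulation, stated here for concreteness):

$$
d_{\mathcal{F}}(\mu,\nu)
= \sup_{f \in \mathcal{F}}
\Big| \mathbb{E}_{x\sim\mu}[f(x)] - \mathbb{E}_{y\sim\nu}[f(y)] \Big|,
\qquad
\widehat{\mu} \in \operatorname*{arg\,min}_{\mu \in \mathcal{G}} \, d_{\mathcal{F}}(\mu, \widehat{\nu}_n).
$$

For the third set, the abstract reformulates optimal transport as a min-max problem over two neural networks solved by primal-dual stochastic gradient descent. The following Python (PyTorch) code is a minimal sketch of one such scheme, assuming a Monge-style objective with quadratic cost in which a dual potential enforces the marginal constraint; the network sizes, the samplers sample_mu and sample_nu, the step sizes, and the penalty weight lam are illustrative assumptions, not the project's actual algorithm.

```python
# Hedged sketch: primal-dual SGD for a min-max optimal transport formulation.
# Assumed objective:
#   min_T max_f  E_mu[|x - T(x)|^2] + lam * (E_nu[f(y)] - E_mu[f(T(x))]),
# where the dual potential f penalizes mismatch between T#mu and nu.
import torch
import torch.nn as nn

def mlp(dim_in, dim_out, width=64):
    return nn.Sequential(
        nn.Linear(dim_in, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, dim_out),
    )

d = 2
T = mlp(d, d)   # primal network: candidate transport map T(x)
f = mlp(d, 1)   # dual network: Kantorovich-type potential f(y)
opt_T = torch.optim.SGD(T.parameters(), lr=1e-3)
# Weight decay crudely regularizes the potential; in practice a Lipschitz
# or gradient penalty is the usual stabilizer for the ascent step.
opt_f = torch.optim.SGD(f.parameters(), lr=1e-3, weight_decay=1e-3)
lam = 1.0       # weight on the marginal-matching (dual) term

def sample_mu(n):   # placeholder source distribution
    return torch.randn(n, d)

def sample_nu(n):   # placeholder target distribution
    return torch.randn(n, d) + 2.0

for step in range(5000):
    # Primal descent step on the transport map.
    x, y = sample_mu(256), sample_nu(256)
    Tx = T(x)
    cost = ((x - Tx) ** 2).sum(dim=1).mean()
    lagrangian = cost + lam * (f(y).mean() - f(Tx).mean())
    opt_T.zero_grad()
    lagrangian.backward()
    opt_T.step()

    # Dual ascent step on the potential: maximize E_nu[f] - E_{T#mu}[f].
    x, y = sample_mu(256), sample_nu(256)
    dual_obj = f(y).mean() - f(T(x).detach()).mean()
    opt_f.zero_grad()
    (-dual_obj).backward()
    opt_f.step()
```

Alternating one descent step with one ascent step, as above, is the generic pattern for primal-dual stochastic gradient methods; the learned map T then transports samples of the source distribution toward the target.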

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 2012652
Program Officer: Yuliya Gorb
Budget Start: 2020-09-01
Budget End: 2023-08-31
Fiscal Year: 2020
Total Cost: $342,394
Name: Georgia Tech Research Corporation
City: Atlanta
State: GA
Country: United States
Zip Code: 30332