In many practical applications, one needs to estimate quantities of interest for a large-scale problem using only subsampled data and partial information about the physical model. Existing computational solvers cannot be used directly for this purpose. On the other hand, many powerful techniques have been developed in data science to represent and compress data efficiently and at low computational cost. A crucial factor in the success of these methods is that they exploit special features of high-dimensional data. The purpose of this project is to integrate physical models with data science to develop a new generation of computational methods that can solve large-scale physical or data science problems using only subsampled data and partial knowledge of the physical model. The mathematical analysis will help reveal important solution structures so that techniques from data science can give accurate approximations of the quantities of interest. Without identifying these special solution structures and using the physical model as a constraint, current techniques from data science cannot be used directly to achieve the PI's goal. This project can have a substantial impact on the computational science and data science communities, and on national technology and society. An additional impact will be the involvement of graduate students. This research provides solid training in mathematical analysis, physical modeling, and data science. The interdisciplinary training the students receive in this project will be very important for their future careers in mathematics and science.

Recent advances in data science offer tremendous opportunities for the computational sciences. A key to success in data science is exploiting special features of high-dimensional data. Traditional PDE solvers have not taken full advantage of special solution structures. PDE analysis and data science complement each other: PDE analysis can identify important solution structures that help the PI design a more effective deep generative network to solve the physical problem. Without guidance from PDE analysis, a naive application of current machine learning algorithms to multiscale problems would fail, because the nonconvex optimization can easily get stuck in a local minimum and converge to the wrong solution. The PI will identify the key ingredients that make such an integration successful, and will investigate which types of PDEs can be compressed and which algorithms can approximate quantities of interest from a small percentage of subsampled data and partial knowledge of the physical model. This research will also provide valuable theoretical understanding of some deep learning methods for solving multiscale problems. The PI will consider both inverse and forward problems. For the forward problem, the PI will develop a novel multiscale method based on subsampled data to reconstruct the solution with guaranteed accuracy. For the inverse problem, the PI will pose it as a Bayesian inverse problem and use deep generative networks. An essential ingredient in this approach is a novel multiscale invertible flow that approximates the transport map, which enables the PI to develop an efficient sampling algorithm that captures the multiple modes of the posterior distribution.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1912654
Program Officer: Yuliya Gorb
Budget Start: 2019-09-01
Budget End: 2022-08-31
Fiscal Year: 2019
Total Cost: $250,000

Name: California Institute of Technology
City: Pasadena
State: CA
Country: United States
Zip Code: 91125