Scientists often use mathematical models to predict the behavior of natural and engineered systems. These models are therefore fundamental to scientific and engineering progress and hence relevant to NSF's science mission. Most models of realistic physical systems use complex formulae (such as, partial differential equations) involving many variables. When using such a model for predicting the future behavior of a system, a scientist has to provide initial values for all the variables. This can be difficult because input values may not be directly measureable. Thus, scientists often must use "inverse" computations to calculate the initial input values of the variables of a system model based on external observations of the real world. In other words, scientists seek to infer inputs to a computer model of a physical process from real observational data of the outputs. There are many examples of inverse computations, ranging from computing the important dimensions of an organ from its CAT scan, reconstructing the source of a sound by measuring its volume and frequency at various places, calculating the density of the Earth from measurements of its gravity field, or calculating the initial condition of the atmosphere (temperature, pressure, etc.) from satellite and weather station observations over a time interval. Inverse problems are ubiquitous across all of science and engineering (and beyond). Many solutions exist for inverse problems, i.e. solutions that fit the data to the observations. However, there are variations in the solutions identified. That is, the solutions of an inverse problem are subject to uncertainty. Bayesian inferencing provides a systematic mathematical framework for characterizing this uncertainty. However, the Bayesian solution of inverse problems for large-scale complex models require enormous computational power. Only recently have algorithms begun to emerge that are computationally tractable. However, these algorithms have remained out of the reach of the mainstream of scientists who solve inverse problems, due to their complexity and the need for deeper information from the forward model. This project aims to develop, distribute, and support open-source software that encodes state-of-the-art algorithms for the solution of large-scale complex Bayesian inverse problems and is robust, scalable, flexible, modular, widely accessible, and easy to use.

The project builds heavily on two complementary open-source software libraries the team has been developing: MUQ at MIT, and hIPPYlib at UT-Austin/UC-Merced. MUQ provides a spectrum of powerful Bayesian inversion models and algorithms, but expects forward models to come equipped with gradients/Hessians to permit large-scale solution. hIPPYlib implements powerful large-scale gradient/Hessian-based inverse solvers in an environment that can automatically generate needed derivatives, but it lacks full Bayesian capabilities. By integrating these two complementary libraries, the project will result in a robust, scalable, and efficient software framework that realizes the benefits of each to tackle complex large-scale Bayesian inverse problems across a broad spectrum of scientific and engineering disciplines. The resulting software, that will be distributed under an open-source license, will provide an environment for rapid development of inverse models equipped with gradient/Hessian information; benchmark problems for evaluation and comparison of algorithms; and tutorial problems for training and testing purposes.

National Science Foundation (NSF)
Division of Advanced CyberInfrastructure (ACI)
Standard Grant (Standard)
Application #
Program Officer
Seung-Jong Park
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
University of Texas Austin
United States
Zip Code