Probabilistic inference allows humans to gain insight and make predictions from data. There is an ever-growing need in business, government, and science to answer questions using data. Often, these questions are best answered by phrasing them as probability calculations. As data sets grow larger and more complex, probability calculations are increasingly difficult and often cannot be performed exactly within reasonable time budgets. This project will promote science and technology by providing new theoretical results, algorithms, and empirical knowledge about how to compute approximate answers to probabilistic queries in a way that achieves good tradeoffs between accuracy and efficiency. In particular, the project will study how to best combine the strengths of two different strategies for calculating probabilities. This work will provide new techniques that are practical, have tunable accuracy, and scale to very large data sets.

To meet these goals, this project will combine two different approaches to probabilistic inference: variational inference (VI) and Monte Carlo (MC). MC algorithms are general-purpose and asymptotically exact, but may fail to give good answers in reasonable time or to scale to large data sets. In contrast, VI is a way to get a "pretty good answer, quickly" by restricting the approximate posterior to a tractable family. This project will combine the two in a principled way to derive algorithms that are general-purpose, practical, have tunable accuracy, and scale to very large data sets. The new algorithms are expected to achieve time-accuracy tradeoffs that dominate Monte Carlo methods for a wide range of problems and time budgets. The proposed methods will 1) incorporate strengths of Monte Carlo methods into variational inference by designing approximating families based on Monte Carlo estimators; and 2) improve the usefulness of variational inference for downstream tasks by adapting divergences and approximating families to the needs of a downstream Monte Carlo estimator. The project will result in a comprehensive evaluation benchmark as well as a set of practical techniques to make the methods more effective. A novel application in ecology will demonstrate the project's real-world potential.
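As a rough illustration of what "tunable accuracy" can mean when a Monte Carlo estimator is placed inside a variational objective, the sketch below compares the standard variational lower bound (one sample) with an importance-weighted bound that averages several samples. It is not the project's method: the toy Gaussian model, the deliberately mismatched approximation, and all function names (`log_joint`, `iw_bound`, and so on) are assumptions chosen only because the exact answer is known in closed form.

```python
import math
import random


def log_normal(x, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((x - mean) / std) ** 2)


def log_joint(z, x):
    """Assumed toy model: z ~ N(0, 1), x | z ~ N(z, 1)."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def iw_bound(x, q_mean, q_std, num_samples, num_reps, rng):
    """Estimate the importance-weighted lower bound on log p(x).

    With num_samples == 1 this is the usual variational lower bound;
    larger values spend more computation for a tighter bound.
    """
    total = 0.0
    for _ in range(num_reps):
        log_weights = []
        for _ in range(num_samples):
            z = rng.gauss(q_mean, q_std)  # draw from the approximation q
            log_weights.append(log_joint(z, x) - log_normal(z, q_mean, q_std))
        total += log_mean_exp(log_weights)
    return total / num_reps


rng = random.Random(0)
x_obs = 1.5

# For this model the marginal is x ~ N(0, 2), so log p(x) is known exactly.
true_log_px = log_normal(x_obs, 0.0, math.sqrt(2.0))

# q = N(0.3, 1.0) is deliberately imperfect; the exact posterior is N(0.75, 0.5).
bounds = {k: iw_bound(x_obs, 0.3, 1.0, k, 2000, rng) for k in (1, 10, 100)}

for k, value in bounds.items():
    print(f"samples={k:3d}  bound={value:.4f}  gap={true_log_px - value:.4f}")
```

In this toy setting the number of samples acts as the accuracy knob: a single sample gives the fast but loose variational bound, while more samples move the estimate toward the exact log marginal likelihood at higher cost.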

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2019-09-01
Budget End: 2022-08-31
Fiscal Year: 2019
Total Cost: $449,623
Name: University of Massachusetts Amherst
City: Hadley
State: MA
Country: United States
Zip Code: 01035