Efficiently estimating integrals of high-dimensional functions is a fundamental and largely unsolved computational problem, arising in scientific areas from biology and physics to economics. In particular, in Artificial Intelligence and Machine Learning, a wide array of methods are computationally limited precisely because they require the computation of high-dimensional integrals. While computing such integrals exactly is intractable, approximations suffice for many applications. Currently, approximation is attempted using two main classes of algorithms: Markov Chain Monte Carlo (MCMC) sampling methods and variational inference techniques. The former are asymptotically accurate, but their computational cost is inflexible and often prohibitive. The latter have a manageable computational budget, but typically come with no accuracy guarantees. This project will investigate a new family of computationally efficient approximation methods that reduce the task of integration to the much better studied task of optimization, thus leveraging decades of research and engineering in combinatorial optimization methods and technology. A key goal of the project is to develop an open-source software library of efficient tools for high-dimensional integration.

The reduction of integration to optimization builds on the probabilistic reduction of decision problems to uniqueness promise problems developed in the mid-80s (the Valiant-Vazirani reduction). Specifically, the idea is to use systems of random parity equations to specify random subsets of the function's domain, and to relate integration to the task of optimization over these subsets. In general, the capacity for efficient optimization fundamentally stems from the capacity to summarily dismiss large parts of the domain as uninteresting. The key question to be addressed by the project is whether it is possible to define random subsets over which optimization is both tractable and informative for integration. To that end, the project will employ random systems of linear equations whose matrices are Low Density Parity Check (LDPC) matrices of error-correcting codes. The energy landscape of such systems, i.e., the number of violated equations as a function of the assignment, is far smoother than that of the generic (dense) random systems of linear equations underlying the original mid-80s technique, making it far more amenable to optimization. The project will also build upon the deep understanding of LDPC codes gained in the last two decades in the field of communications, with the goal of incorporating a priori knowledge about the energy landscape into the optimization strategy. This will provide a fundamentally new use for error-correcting codes, creating a bridge between the areas of optimization and information theory.
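A concrete instance of this reduction is the family of hashing-based estimators such as WISH (Ermon et al., ICML 2013), which estimates a discrete integral Z = sum over x in {0,1}^n of w(x) by adding i random parity constraints for i = 0, ..., n and querying an optimizer for the maximum of w over each surviving subset. The Python sketch below illustrates that scheme only; all names (random_parity_system, constrained_max, wish_log2_estimate) are hypothetical, brute-force enumeration stands in for the combinatorial optimizer the project would use, and the row_weight option merely gestures at the sparse, LDPC-style systems the project proposes rather than implementing them.

```python
import numpy as np

def random_parity_system(m, n, rng, row_weight=None):
    """Sample a random parity system A x = b (mod 2) with m equations.

    row_weight=None gives dense rows (each coefficient is 1 with
    probability 1/2), as in the classic mid-80s construction; a small
    row_weight instead gives sparse, LDPC-like rows, whose energy
    landscape is conjectured to be smoother and easier to optimize over.
    """
    if row_weight is None:
        A = rng.integers(0, 2, size=(m, n))
    else:
        A = np.zeros((m, n), dtype=int)
        for row in A:
            row[rng.choice(n, size=row_weight, replace=False)] = 1
    b = rng.integers(0, 2, size=m)
    return A, b

def constrained_max(w, A, b, n):
    """Max of w over the random subset {x in {0,1}^n : A x = b (mod 2)}.

    In practice this step is the job of a combinatorial optimizer;
    exhaustive enumeration here only keeps the sketch self-contained.
    """
    best = 0.0
    for idx in range(2 ** n):
        x = np.array([(idx >> k) & 1 for k in range(n)])
        if np.all(A.dot(x) % 2 == b):
            best = max(best, w(x))
    return best

def wish_log2_estimate(w, n, trials=5, rng=None):
    """WISH-style estimate of log2(Z), Z = sum_{x in {0,1}^n} w(x).

    M[i] is the median, over a few random systems, of the maximum of w
    subject to i parity constraints; the estimate is
        Z_hat = M[0] + sum_{i=0}^{n-1} M[i+1] * 2^i.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    M = []
    for i in range(n + 1):
        vals = [constrained_max(w, *random_parity_system(i, n, rng), n)
                for _ in range(trials)]
        M.append(np.median(vals))
    Z_hat = M[0] + sum(M[i + 1] * 2 ** i for i in range(n))
    return np.log2(Z_hat)

if __name__ == "__main__":
    n = 10
    theta = np.random.default_rng(1).normal(size=n)
    w = lambda x: float(np.exp(theta @ x))             # product-form weight
    exact = float(np.sum(np.log2(1 + np.exp(theta))))  # Z factorizes here
    print(f"exact  log2 Z = {exact:.2f}")
    print(f"WISH estimate = {wish_log2_estimate(w, n):.2f}")
```

With dense parity rows this estimator is known to achieve a constant-factor approximation of Z with high probability; switching to a small row_weight is precisely where the project's open questions begin, since the constrained maximization becomes easier but the statistical guarantees of dense parities no longer apply automatically.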

Project Start:
Project End:
Budget Start: 2017-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2017
Total Cost: $360,000
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305