Probabilistic graphical models offer a mathematical framework to describe spatial and temporal relationships between entities. From some observable measurements, inferences can be made about other connected hidden factors. When applied to big datasets, however, standard inference methods are prohibitively time-consuming, requiring a large number of iterative updates to infer each variable. To avoid these updates, amortized inference uses a neural network to directly compute an approximate solution for every variable, which is much faster. Amortized inference works well for simpler models but not for models that capture more correlations between data. This project's goal is to generalize amortization to models that have dependent local connectivity between data. Findings from this project may be applicable to many useful large-scale applications, such as spatial modeling of bird sightings across North America and spatiotemporal modeling of opioid overdoses across neighborhood, state, and regional scales in the U.S. to inform more effective public health interventions. This research will further support the development of a project-based college-level course on probabilistic graphical models, as well as open-source software packages that make the project's new inference methods available to non-experts.

The project's technical contribution will bring the efficiencies of amortized inference from models that make strong conditional independence assumptions about data to a wider class of probabilistic graphical models that capture more realistic structured dependencies between data. Across three common inference algorithms -- structured Variational Inference (VI), Loopy Belief Propagation (LBP), and Expectation Propagation (EP) -- two key technical innovations will be applied: (1) decomposition of the optimization objective to allow scalable neighborhood-by-neighborhood processing, and (2) identification of reusable neighborhood substructure that can be fed into an amortizing neural network to produce approximate local posterior distributions. These innovations are challenging because most straightforward attempts would encounter an intractable entropy term (VI) or not maintain consistency between local marginal distributions of random variables (LBP or EP). These challenges may be met by developing improved bounds and parameterizations that enforce consistency, leading to amortized inference in which the number of parameters remains fixed and affordable even when applied to large graphical models that include millions of variables.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start
Project End
Budget Start
2019-09-01
Budget End
2022-08-31
Support Year
Fiscal Year
2019
Total Cost
$399,923
Indirect Cost
Name
Tufts University
Department
Type
DUNS #
City
Boston
State
MA
Country
United States
Zip Code
02111