Statistical models provide a powerful means of quantifying uncertainty, modeling prior beliefs, and describing complex dependencies in data. The process of using a model to answer specific questions, such as inferring the state of several random variables given evidence observed about others, is called probabilistic inference. Probabilistic graphical models, a type of statistical model, are often used in diverse applications such as medical diagnosis, understanding protein and gene regulatory networks, computer vision, and language understanding. On account of the central role played by probabilistic graphical models in a wide range of automated reasoning applications, designing efficient algorithms for probabilistic inference is a fundamental problem in artificial intelligence and machine learning.
Probabilistic inference in many of these applications corresponds to a complex combinatorial optimization problem that at first glance appears to be extremely difficult to solve. However, practitioners have made significant strides in designing heuristic algorithms to perform real-world inference accurately and efficiently. This project focuses on bridging the gap between theory and practice for probabilistic inference problems in large-scale machine learning systems. The PIs will identify structural properties and methods of analysis that differentiate real-world instances from worst-case instances used to show NP-hardness, and will design efficient algorithms with provable guarantees that would apply to most real-world instances. The project will also study why heuristics like linear programming and other convex relaxations are so successful on real-world instances. The efficient algorithms for probabilistic inference developed as part of this project have the potential to be transformative in machine learning, statistics, and more applied areas like computer vision, social networks and computational biology. To help disseminate the research and foster new collaborations, a series of workshops will be organized bringing together the theoretical computer science and machine learning communities. Additionally, undergraduate curricula will be developed that use machine learning to introduce students to concepts in theoretical computer science.