This project focuses on the computational complexity of problems in semi-algebraic inference and in massive data set processing, as well as on advancing the analytical techniques of multiparty communication complexity, which has been a useful tool for understanding problems in both of these areas.

Semi-algebraic inference systems express sets of constraints using polynomial inequalities and use rules of inference to derive new polynomial inequalities from existing ones. They are widely used in operations research and combinatorial optimization. Part of this project is to investigate what can be derived efficiently using semi-algebraic inference, including the complexity of proofs of unsatisfiability based on semi-algebraic inference, the quality of the linear and semi-definite programming approximations for NP-hard optimization problems that can derived using known semi-algebraic inference, and the extent to which our understanding of inference over binary domains can be leveraged to yield better algorithms for classes of semi-algebraic inference over the real numbers.

Networked computers have allowed the accumulation of data sets of unprecedented size. New distributed methods that process such massive data sets efficiently by giving only approximate answers are having substantial impact in practice. However, the theoretical models of massive data set processing that have been analyzed to date represent only a very limited class of methods and do not capture techniques already used in practice. Part of this research is aimed at producing and analyzing theoretical models that will allow us to understand the limitations of these methods for massive data set processing and to make better use of them.

Multiparty communication complexity measures the amount of information that must be communicated between multiple participants, each having partial information about the inputs to a function, in order to compute its output. Because of its flexibility it is widely applicable to problems in computational complexity. The final goal of this project is to add to the rather limited set of techniques that can be used to analyze multiparty communication complexity with the aim goal of improving its applications to semi-algebraic inference and distributed massive data set processing.

Project Report

The research supported by this grant developed new methods for analyzing the complexity of semi-algebraic inference problems. These are problems in which constraints are represented by multivariate polynomial inequalities, and there are inference rules to derive new constraints from a given constraint. The research focused on the complexity of determining whether given sets of contraints can be simultaneously satisfied given such methods for inference. The length of the derivation, given some bound on the degree of the polynomials involved, is a natural measure of the complexity of such inference.Specific systems for low degree inference are the best methods currently known for finding approximations to NP-hard optimization problems involving constraint satisfaction. This research developed new general methods for showing that low degree semi-algebraic inference problems require large complexity. These methods are based on an approach for taking problems that are of high complexity for much simpler inference methods and modifying them so that they are hard for semi-algebraic inference. This research also analyzed a relatively recent model of how massive data can be processed, the read/write streams data processing model. This is an extension of a much simpler one-pass data stream model that has been very important for high-speed data analysis. The read/write streams model also models the ability of the high-speed processing to store data streams and recombine these parallel streams sequentially at high speed. This analysis showed that for a variety of important problems of estimating statistics on the input data, read/write streams did not provide significant additional power relative to the simpler one-pass data streams. Both the research on semi-algebraic inference and read/write streams relied on relating these problems to multiparty communication complexity. Multiparty communication complexity studies the amount of communication (but not computation) required by cooperating agents to solve problems related to their shared inputs. The research supported by this award also developed a new method for analyzing multiparty communication complexity. One consequence of this new method was a new lower bound on the complexity required for small depth circuits to compute simple functions. The research supported by this award also produced new methods for analyzing the tradeoffs between the time and storage space required to solve a variety of computational and inference problems. It also produced new lower bounds on the number of input queries a quantum computer would require to compute simple functions of its input. This award provided extensive research support and training for two graduate students, one of whom graduated with a PhD in 2011 and the other of whom is expected to graduate with a PhD in the first half of 2013.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Application #
0830626
Program Officer
Dmitry Maslov
Project Start
Project End
Budget Start
2008-08-01
Budget End
2012-07-31
Support Year
Fiscal Year
2008
Total Cost
$497,377
Indirect Cost
Name
University of Washington
Department
Type
DUNS #
City
Seattle
State
WA
Country
United States
Zip Code
98195