This project aims to investigate two kinds of problems in theoretical computer science and information theory. The first concerns hardness amplification. Here the goal is to find ways to manipulate functions that are hard to compute in some computational model in order to obtain new functions that are significantly harder to compute. For example, in communication complexity, one might expect that computing k copies of a function should take k times the communication, but to date we do not know how to prove this. Prior work has shown that the communication must grow by a factor of roughly the square root of k, but it remains open whether this can be increased to a factor of k. Another domain where this kind of question makes sense is streaming algorithms. In a streaming algorithm, the input arrives as a massive data stream that cannot be stored, and the goal is to compute a function of the data using as little memory as possible. One might expect that handling k independent streams of data in parallel should require k times the memory, yet we do not know how to prove this. Similar questions can be asked about amplifying the hardness of approximation algorithms, and about showing that composing a function with itself increases its circuit depth. Questions of this kind are closely tied to proving lower bounds, a central goal of theoretical computer science.
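
To state the central question concretely, the following formulation is standard in the area, though the notation here is ours rather than the abstract's: write CC(f) for the communication complexity of a function f, and f^k for the task of computing k independent copies of f. The known bound and the open question are then

    \mathrm{CC}(f^k) \;\ge\; \widetilde{\Omega}(\sqrt{k}) \cdot \mathrm{CC}(f) \quad \text{(known)},
    \qquad
    \mathrm{CC}(f^k) \;\overset{?}{\ge}\; \Omega(k) \cdot \mathrm{CC}(f) \quad \text{(open)}.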

The second kind of problem is about compression. Today we have a good understanding of how single messages can be compressed so that the number of bits needed to represent them is roughly equal to the information that they carry. Here the information can be measured using Shannon's entropy function or, in the case of randomized transmissions, the mutual information between the inputs and the messages. However, when several parties engage in an interactive communication process, it is not clear how to reduce the communication of the interaction until it is close to the amount of information conveyed between the parties. This problem is closely related to the problem of amplifying the communication complexity of a function, discussed above. Prior work has shown how to reduce the communication of a protocol that conveys little information, and such a compression scheme turns out to be useful for proving that computing many copies of a function must require larger communication. Indeed, an optimal compression scheme would give an optimal result in the setting of hardness amplification, and finding such a scheme is a major goal of this project. This project also aims to study the compression of the memory used by streaming algorithms. Given a streaming algorithm, can we always reduce its memory usage until the number of bits used is close to the amount of information stored by the algorithm? In this setting it is not clear what the right measure of information should be, and defining a meaningful measure of information is another goal of the project.
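
To make "information conveyed" precise in the two-party setting, the measure used throughout this line of work is the internal information cost; the notation below is ours. For a protocol \pi with transcript \Pi, run on inputs (X, Y) drawn from a distribution \mu,

    \mathrm{IC}_\mu(\pi) \;=\; I(X ; \Pi \mid Y) \;+\; I(Y ; \Pi \mid X),

i.e., the information each party learns about the other's input from the transcript. The compression question asks whether every protocol can be simulated with communication close to this quantity.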

The common theme tying the problems of this proposal together is a collection of information-theoretic methods applicable to all of them.

Project Report

In the past year, we worked on several projects that use methods from information theory to prove lower bounds on communication complexity and other computational models. In addition, the PI co-organized a workshop at the Banff International Research Station on "Communication Complexity and its Applications", the first such meeting in a couple of decades.

One of the major goals of theoretical computer science is to find ways to prove that known algorithms are optimal. Unfortunately, despite much effort, we know of no way to do this today beyond arguing that linear-time algorithms are optimal because they must read all of their input. We invested a lot of effort over the last year and a half on making progress on this issue. We defined a new natural model of computation using circuits. These circuits are allowed to have gates with fan-in up to 2n/3, where n is the number of bits in the input, and each gate may compute an arbitrary function of its inputs. We showed that proving lower bounds in this model would give new size-depth tradeoffs for Boolean circuits, a major open problem that has remained unresolved for about 40 years. Specifically, it would show that no circuit of linear size and logarithmic (in n) depth computes some explicit function of n bits. The ultimate goal is to prove that no circuit of linear size computes some explicit function, but we do not even know how to prove such a lower bound when the depth is restricted to be logarithmic. Our work defined a model of communication that can simulate such circuits, and so makes it possible to attack the lower bound problem using methods from communication complexity. The model is as follows: there are k = n/100 players, and each player can choose to see some n/3 of the input bits. One needs to argue that if each player can send only one bit, then the players cannot compute the function.

We also studied the problem of compressing communication between two parties. Despite much effort, the best result in this direction remains the one obtained in our work with Barak et al., where we showed that any protocol that reveals I bits of information and has C bits of communication can be simulated by a protocol with communication roughly sqrt(IC). This year we studied the case where the information is asymmetric, namely, where one party learns much more information than the other during the protocol. We showed that if one party learns I_A bits and the other learns I_B bits (so the total information is I = I_A + I_B), then the protocol can be compressed to communication roughly (I_A)^{1/4} C^{3/4} + I_A + I_B, beating the earlier result for certain values of I_A and I_B.

In work with faculty from the Technion, we studied the number-on-forehead model of communication complexity. Here k parties each have an input on their forehead and wish to compute some function of all the inputs; each party can see every input except their own. A classical problem with many applications is set disjointness: the parties each have a subset of {1, 2, ..., n} on their forehead and want to know whether the intersection of all the sets is empty or not. A long sequence of works has attempted to prove lower bounds in this model, culminating (prior to our work) in a lower bound of sqrt(n)/2^k for this problem. In our work we improved the lower bound to n/4^k, which is almost optimal.
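
In symbols (again in notation of our own choosing): player i has a set S_i \subseteq \{1, \dots, n\} on their forehead, and the players must decide whether \bigcap_{i=1}^{k} S_i = \emptyset. The improvement described above raises the lower bound on the communication from

    \Omega\!\left(\frac{\sqrt{n}}{2^k}\right)
    \qquad\text{to}\qquad
    \Omega\!\left(\frac{n}{4^k}\right),

i.e., the new bound is exactly the square of the old one.
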
Finally, we studied how to extend the information-theoretic methods we have developed to the setting of streaming algorithms. We defined a measure of information for streaming algorithms and proved several results about compressing the memory of such algorithms.
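
As an illustration of what such a measure might look like, and not necessarily the definition adopted in that work: for a streaming algorithm with memory state M_t after reading the first t items X_1, \dots, X_t of the stream, one natural analogue of the information cost above is

    \max_{t} \; I(M_t \; ; \; X_1, \dots, X_t),

and compressing the algorithm would then mean reducing its memory footprint to roughly this many bits.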

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1016565
Program Officer: Dmitry Maslov
Budget Start: 2010-09-01
Budget End: 2014-08-31
Fiscal Year: 2010
Total Cost: $406,410
Name: University of Washington
City: Seattle
State: WA
Country: United States
Zip Code: 98195