Randomness extraction is an algorithmic process that transforms objects with low-quality randomness into objects with high-quality randomness. The objects (usually called sources) can be finite probability distributions, finite binary strings, or infinite binary sequences, and the quality of their randomness is measured, respectively, by min-entropy, Kolmogorov complexity, and constructive Hausdorff dimension. There is a vast body of work on randomness extraction in all three settings. Our project will extend the field by investigating randomness extraction in situations that go beyond some of the basic assumptions for which the main current techniques and tools have been developed.
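For reference, two of these standard quality measures can be written out explicitly; the following display states the usual definitions of min-entropy (for a distribution X over n-bit strings) and constructive Hausdorff dimension (for an infinite binary sequence S), with C denoting Kolmogorov complexity. These are textbook definitions, not results of the project.

% Min-entropy of a distribution X over n-bit strings:
\[
  H_\infty(X) \;=\; \min_{x \,:\, \Pr[X=x]>0} \log_2 \frac{1}{\Pr[X=x]}
\]
% Constructive Hausdorff dimension of an infinite binary sequence S,
% where C(S_1 \cdots S_n) is the Kolmogorov complexity of its first n bits:
\[
  \dim(S) \;=\; \liminf_{n \to \infty} \frac{C(S_1 \cdots S_n)}{n}
\]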
In some applications, the assumption that the probability distributions involved are independent is problematic. Therefore, an algorithm that attempts to remedy defective sources should be able to handle sources with only bounded independence. The project will study randomness extraction from two or more sources with bounded independence, in contrast with the current literature, which has considered only fully independent sources. Extending some partial results, the objective is to establish upper bounds on the quality of randomness that can be obtained from such sources and to design efficient extractors that perform close to these upper bounds. Another research line of the project is dedicated to exposure-resilient extractors, which are efficient procedures that extract bits that look random even to an adversary with adaptive access to the input sources. The objective is to design such extractors with parameters suitable for applications in cryptography.
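For orientation, the classical baseline in the fully independent two-source setting is the Chor-Goldreich inner-product extractor. The following Python sketch is only an illustration of that baseline, not a construction proposed by the project:

def inner_product_extractor(x: str, y: str) -> int:
    """Chor-Goldreich two-source extractor: the inner product over
    GF(2) of two equal-length bit-string samples. If x and y come
    from two *independent* sources, each with min-entropy rate above
    1/2, the output bit is statistically close to uniform."""
    assert len(x) == len(y)
    return sum(int(a) & int(b) for a, b in zip(x, y)) % 2

# Example: one (nearly) uniform bit from two 8-bit weak samples.
print(inner_product_extractor("10110010", "01101101"))

The independence assumption is essential to the analysis of this construction; when the sources are only boundedly independent, the guarantee degrades, and quantifying that gap is precisely the regime the project addresses.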
Randomness extraction is a very active field and has been an incubator of ideas with significant impact even outside the area. The proposed research opens new directions of study that are natural and challenging. It adds new dimensions to the study of randomness extraction and has the potential to enlarge the range of applications of extractors. Randomness extraction has applications in computational complexity, randomized algorithms, constructive combinatorics, cryptography, error-correcting codes, and other areas. The project attempts to make many of these applications more practical and more robust. The project will establish new connections between computational complexity, Kolmogorov complexity, and algorithmic information theory. The project will allow undergraduate and graduate students to participate in research activities that have a strong theoretical flavor and the promise of real-world applications.
This grant has studied questions on the relation between computation and randomness. Randomness is a powerful computational resource. For some problems, randomized algorithms are significantly faster than the best known deterministic algorithms. Furthermore, in some areas (e.g., cryptography, distributed computing, game theory, machine learning) the use of randomness is compulsory. While there appear to be sources of genuine randomness in Nature, they produce sequences of bits with various biases and correlations that are not suitable for direct use in some applications. Therefore, an important problem is randomness extraction: producing sources of (close to) perfect randomness from sources with defective randomness.

The quality of a source of randomness is measured either using probability distributions and different types of entropy, or using the concepts of algorithmic information theory, also known as Kolmogorov complexity. In our research, we have primarily used the latter approach. Here we view the source as a string x, and the randomness in x is given by its Kolmogorov complexity C(x), defined as the length of a shortest description of x. We highlight some of the main results.

In randomness extraction, one important distinction stems from the number of defective input sources that are available: one input source versus two or more input sources. It is well known that effective randomness extraction from a single source of randomness is impossible, so such an extractor needs some additional information about the source x, and the question is how much additional information is needed. One result we have obtained is that if the amount of nonuniform information is constant, then one cannot extract a string whose complexity rate is 1, where the complexity rate of a string x is the ratio between C(x) and the length of x. On the positive side, we show that with essentially any non-constant amount of information it is possible to extract a string with complexity rate 1.

The next possibility is to extract randomness from two defective sources, and in this case one important parameter is the degree of dependency of the two sources (for strings x and y, the dependency can be taken, up to logarithmic precision, to be C(x) + C(y) - C(xy)). We have shown that there exists a computable Kolmogorov extractor such that, for any two n-bit strings with complexity s(n) and dependency a(n), it outputs a string of length s(n) with complexity s(n) - a(n) conditioned on either one of the input strings. These parameters are optimal for a Kolmogorov extractor.

Interesting results have also been obtained regarding the effective determination of a shortest description of a finite string (or even a description whose length is within an additive constant of a shortest one). This is a canonical example of an uncomputable function, and many other problems regarding short descriptions are uncomputable as well. We have studied the extent to which one can effectively find an approximate solution, where "approximate" is in the sense of list approximation: instead of achieving the ideal objective of constructing an object with the desired property (in our case, a shortest, or close to shortest, description of a given finite string), we construct a short list of objects with the guarantee that at least one of them has the property.
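In symbols, writing U for the universal machine underlying C, the task just described asks to effectively compute, from x, a list L(x) such that

\[
  \exists\, p \in L(x) \ \text{ with } \ U(p) = x
  \quad \text{and} \quad |p| \le C(x) + O(1),
\]
% while keeping |L(x)| small (polynomial in |x| in the results below).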
We have shown the surprising result that, on input a finite string x, it is possible to effectively construct a short list (where "short" means of size polynomial in n, the length of x) that contains a program for x that is within a constant of a shortest program. If a program for x has length at most C(x) + c, we say that it is a c-short program for x. The size of the list is n^2, and we have also shown that any list containing a c-short program for x must have size proportional to n^2/(c+1)^2. In fact, such a construction can be done very fast (in polynomial time), with a list of size approximately n^6. If we allow randomized constructions, one can compute a list with only n elements guaranteed to contain an O(log n)-short program for the given string x. Even more, one can compute in polynomial time a list with n elements guaranteed to contain an O(log^2 n)-short program for x. An interesting aspect is that the polynomial-time randomized construction requires only O(log^2 n) random bits. Such a list cannot be computed deterministically, regardless of the running time. Therefore, the result gives a natural example of a task that can be solved probabilistically in polynomial time with few random bits but cannot be solved deterministically at all. One other example of this "absolute" power of randomized algorithms was known in the literature, but that example was trivial and unconvincing.

Other interesting results have been obtained regarding language compression, the effective construction of Ramsey graphs and van der Waerden sequences, symmetry of information, independence amplification, and counting results in Kolmogorov complexity.
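As a toy illustration of the shape of these list-approximation results (and only that: C(x) is uncomputable, so the following Python sketch substitutes zlib compression as a computable stand-in for shortest programs, and the helper candidate_descriptions is ours, not the construction from the papers):

import zlib

def candidate_descriptions(x: bytes) -> list[bytes]:
    """Return a short list of descriptions of x, one of which is
    shortest among the compression levels tried. zlib stands in for
    the (uncomputable) shortest program, and decompression plays the
    role of the universal machine."""
    return [zlib.compress(x, level) for level in range(1, 10)]

x = b"abracadabra" * 20
candidates = candidate_descriptions(x)
best = min(candidates, key=len)    # near-shortest description in the list
assert zlib.decompress(best) == x  # the selected entry really describes x
print(len(candidates), "candidates; shortest:", len(best), "bytes")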