In real-time voice and high-speed data applications, limited delay is a key design constraint; indeed, packet sizes as short as a few hundred bits are common in wireless systems. The objective of this research is to go beyond traditional refinements of the fundamental asymptotic information-theoretic limits and investigate the back-off from capacity (in channel coding), the overhead over the entropy (in lossless compression), and the overhead over the rate-distortion function (in lossy compression) incurred by coding at a given blocklength. We plan to revisit the major design principles stemming from the analysis of capacity, the rate-distortion function, and the minimum source coding rate, determine which of them still apply in the non-asymptotic regime, and, for those that do not, assess the penalty incurred by abiding by them at short blocklengths.
Our study of the non-asymptotic behavior of the optimum achievable rate as a function of both blocklength and error probability involves two complementary goals: a) computable upper and lower bounds tight enough to reduce the uncertainty about the non-asymptotic operational fundamental limit to a level that is negligible compared to the gap to the long-blocklength asymptotics; b) analytical approximations to the bounds that are accurate even at short blocklengths, so as to offer insight into good coding strategies and enable the formulation of practically relevant optimization problems. These approximations typically involve a parameter we refer to as dispersion, which quantifies the stochastic variability of sources and channels.
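To fix ideas, a typical form of such an approximation (sketched here in standard notation, where $n$ denotes the blocklength, $\epsilon$ the error probability, $M^*(n,\epsilon)$ the maximal codebook size, and $Q^{-1}$ the inverse of the Gaussian complementary distribution function) expresses the maximal channel coding rate as
\[
\frac{\log M^*(n,\epsilon)}{n} \approx C - \sqrt{\frac{V}{n}}\, Q^{-1}(\epsilon),
\]
where $C$ is the channel capacity and $V$ is the channel dispersion; an analogous expansion, with the rate-distortion function in place of the capacity and a source dispersion term in place of $V$, governs lossy compression.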
The goal of this project is to analyze the fundamental limits in information theory (data transmission, as well as lossless and lossy data compression) from a non-asymptotic perspective. Starting with Claude Shannon's work, the tradeoff between delay, transmission rate, and error probability in channel coding has been solved in the asymptotic regime of large delay, in which the error probability can be made as small as desired as long as the transmission rate is below the channel capacity. In many applications of contemporary interest, such as high-speed wireless data transmission, the tolerable delays are quite small; in fact, blocklengths on the order of 1000 bits are quite common. How close to capacity can we transmit information at such blocklengths? This project has made a giant step toward answering that fundamental question.

Essentially, up to now there have been three different approaches to the constructive (achievability) side of the channel fundamental limit: Shannon (1948), Feinstein (1954), and Gallager (1965). Going beyond those established approaches, we have obtained the tightest known achievability lower bounds. Our new results are simple enough that they can be taught in information theory courses. As for upper (converse) bounds, the most conventional approach, based on a classical inequality by Fano (1952), is very loose except in the very-long-blocklength regime; tighter results exist due to Wolfowitz (1960) and to Shannon, Gallager and Berlekamp (1967). Not only have we obtained even tighter results, but we have also devised a technique, which we call the meta-converse, that offers a systematic way to derive new converses as well as to rediscover existing ones. We have applied this new machinery to simple memoryless channels, and the results have been surprisingly tight.

Moreover, we have derived simple analytical approximations to those bounds, which are quite accurate for blocklengths even as low as 200. Those approximations are governed, in addition to the channel capacity, by what we refer to as the channel dispersion: a fundamental parameter that determines the delay required to overcome the statistical fluctuations of the channel randomness. Beyond memoryless channels, we have applied this approach to channels with dynamics, such as discrete channels with Markov memory and Gaussian channels with fading. In contrast to the capacity, the channel dispersion does depend on the speed of the channel dynamics, thereby revealing the length of the required coding horizon.

We have also devoted considerable effort to finding the fundamental limits of communication in the presence of feedback. The traditional assessment of the reliability increase afforded by feedback follows an asymptotic large-deviations approach, which has failed to make an impact on actual design. We have taken a completely different, non-asymptotic route, which shows that the reduction in required delay brought about by the availability of feedback can be quite dramatic, even if the feedback is only of the decision-feedback variety. Particularly striking is the usefulness of feedback in terms of energy per bit in Gaussian channels when only a finite number of bits is to be transmitted.

In lossy compression, the state of the art prior to this project on the fundamental non-asymptotic tradeoff for a given excess-distortion probability comprised Shannon's 1959 achievability theorem and Marton's 1974 converse theorem.
They are good enough to give tight results in the limit, even in the large-deviations regime, but they are woefully inadequate as an indication of what can be achieved in the regime of 1000 samples. This is in stark contrast to the bounds we have found in this project. Another new result, which I now teach in my class, gives a very general and simple converse bound that turns out to be quite tight in the short-blocklength regime. The key is the introduction of the notion of d-tilted information density, which plays the same role as the information density in channel coding (without cost constraints). We have also been able to obtain the exact performance of Shannon's random reproduction-point selection, and have gone on to obtain very tight achievability bounds in terms of the generalized d-tilted information density. Another important contribution is the introduction of the rate-dispersion function, which, together with the rate-distortion function, gives a very succinct way to approximate the finite-blocklength bounds for all but very small blocklengths.

Another highlight is the characterization of the best achievable joint source-channel coding performance in the non-asymptotic regime. We have made the most convincing case to date for the benefits of combined compression and transmission in applications where delay is severely constrained. We have also revisited the "to-code-or-not-to-code" question, studied by Goblick, Csiszár-Körner, and Gastpar, among others, who noticed that for some lucky marriages of channels, sources, and distortion measures, symbol-by-symbol coding is optimal. We have demonstrated that even when one is not so lucky, symbol-by-symbol coding may be the best available choice in the short-blocklength regime, despite being suboptimal in the long run.
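To give a concrete, hedged sense of how these dispersion-based approximations can be evaluated numerically, the short Python sketch below computes the Gaussian approximation to the maximal channel coding rate of a binary symmetric channel and to the minimal lossy compression rate of a biased binary source under Hamming distortion. The capacity, rate-distortion, and dispersion expressions it uses are the standard closed-form ones for these two textbook examples, not the project's exact finite-blocklength bounds, and the particular parameter values in the usage example are arbitrary.

```python
# Illustrative sketch: Gaussian ("normal") approximations governed by capacity
# plus channel dispersion (channel coding) and by the rate-distortion function
# plus the rate-dispersion function (lossy compression). Standard formulas for
# the BSC and the Bernoulli source with Hamming distortion; not exact bounds.
import math
from statistics import NormalDist


def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def q_inv(eps):
    """Inverse Gaussian complementary CDF: Q^{-1}(eps) = Phi^{-1}(1 - eps)."""
    return NormalDist().inv_cdf(1 - eps)


def bsc_normal_approx(delta, n, eps):
    """Approximate maximal rate (bits/channel use) of a BSC with crossover
    probability delta, blocklength n, block error probability eps:
        R ~ C - sqrt(V/n) * Q^{-1}(eps) + (log2 n) / (2n),
    with C = 1 - h(delta) and dispersion V = delta(1-delta) log2^2((1-delta)/delta).
    The (log2 n)/(2n) third-order term is the one appropriate for the BSC."""
    capacity = 1 - h2(delta)
    dispersion = delta * (1 - delta) * math.log2((1 - delta) / delta) ** 2
    return capacity - math.sqrt(dispersion / n) * q_inv(eps) + math.log2(n) / (2 * n)


def bms_normal_approx(p, d, n, eps):
    """Approximate minimal rate (bits/sample) for compressing a Bernoulli(p)
    source (d < p <= 1/2) within Hamming distortion d, with excess-distortion
    probability eps:
        R ~ R(d) + sqrt(V(d)/n) * Q^{-1}(eps),
    with R(d) = h(p) - h(d) and rate-dispersion function
    V(d) = p(1-p) log2^2((1-p)/p), the variance of the source information."""
    rate_distortion = h2(p) - h2(d)
    dispersion = p * (1 - p) * math.log2((1 - p) / p) ** 2
    return rate_distortion + math.sqrt(dispersion / n) * q_inv(eps)


if __name__ == "__main__":
    # Example: BSC(0.11), blocklength 1000, error probability 1e-3.
    print("BSC  rate:", bsc_normal_approx(0.11, 1000, 1e-3))
    # Example: Bernoulli(0.4) source, Hamming distortion 0.1, 1000 samples.
    print("BMS  rate:", bms_normal_approx(0.4, 0.1, 1000, 1e-3))
```

For instance, with crossover probability 0.11, blocklength 1000, and error probability 1e-3, the sketch returns a rate noticeably below the capacity of about one-half bit per channel use, illustrating the dispersion-governed back-off from capacity discussed above.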