Data compression is a core component of all communication protocols, as it can translate into bandwidth savings, energy efficiency, and low-delay operation. In the traditional setup, an information source compresses its messages so that they can be communicated efficiently, with the goal of ensuring accurate reconstruction at the destination. This project seeks to design compression schemes tailored to machine learning applications: if the transmitted messages support a given learning task (e.g., classification or learning), the compression scheme should be optimized to support that task rather than to maximize reconstruction accuracy. This approach can yield significant gains in communication efficiency and, in doing so, contribute to the successful deployment of distributed machine learning algorithms over networks.

Traditionally, compression schemes are evaluated using rate-distortion trade-offs; this project instead studies rate-accuracy trade-offs, where accuracy captures the effect that quantization may have on a specific machine learning task. Of particular interest are information-theoretic lower bounds and trade-offs, as well as explicit compression schemes, for two questions: (1) how to compress for model training, when distributed, communication-constrained nodes must learn a model quickly and efficiently; and (2) how to compress for communication during inference. The project will derive bounds and algorithms for distributed compression of features drawn from composite distributions and used for a machine learning task such as classification. This work will advance the state of the art and build new connections between the areas of data compression and distributed machine learning.
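To make the rate-accuracy notion concrete, the sketch below traces an empirical rate-accuracy curve on a toy problem: features are quantized at increasing bit budgets and the resulting classification accuracy is measured. This is only a minimal illustration under assumed choices (a Gaussian mixture source, a uniform scalar quantizer, and a nearest-centroid classifier), not the compression schemes or distributions the project will actually develop.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_quantize(x, bits, lo=-4.0, hi=4.0):
    """Uniformly quantize each feature to 2**bits levels on [lo, hi] (illustrative choice)."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    x_clipped = np.clip(x, lo, hi)
    return np.round((x_clipped - lo) / step) * step + lo

# Toy two-class source: Gaussian features with shifted means (an assumption
# made only for this sketch; the project targets composite distributions).
n, d = 2000, 10
X = np.vstack([rng.normal(-1.0, 1.0, size=(n, d)),
               rng.normal(+1.0, 1.0, size=(n, d))])
y = np.array([0] * n + [1] * n)

# Shuffle and split into train/test halves.
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]
split = len(y) // 2
Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]

def nearest_centroid_accuracy(Xtr, ytr, Xte, yte):
    """Fit class centroids on (possibly quantized) features; return test accuracy."""
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    d0 = np.linalg.norm(Xte - c0, axis=1)
    d1 = np.linalg.norm(Xte - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float((pred == yte).mean())

# Sweep the rate (bits per feature) and record task accuracy: the resulting
# curve is a rate-accuracy trade-off rather than a rate-distortion one.
for bits in [1, 2, 3, 4, 8]:
    acc = nearest_centroid_accuracy(
        uniform_quantize(Xtr, bits), ytr, uniform_quantize(Xte, bits), yte
    )
    print(f"{bits} bits/feature -> test accuracy {acc:.3f}")
```

On this easy toy source the accuracy saturates at a very low rate, which is exactly the kind of behavior a task-aware (rather than reconstruction-aware) compressor would exploit; characterizing when and how few bits suffice is the substance of the bounds the project aims to derive.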

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-08-01
Budget End: 2023-07-31
Fiscal Year: 2020
Total Cost: $523,925
Name: University of California Los Angeles
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095