With the emergence of Big Data platforms in social and life sciences, it is becoming of paramount importance to develop efficient lossless and lossy data compression methods catering to the need of such information systems. Although many near-optimal compression methods exist for classical text, image and video data, they tend to perform poorly on data which naturally appears in fragmented or ordered form. This is especially the case for so called ordinal data, arising in crowd-voting, recommender systems, and genome rearrangement studies. There, information is represented with respect to a ?relative,? rather than ?absolute? scale, and the particular constraints of the ordering cannot be properly captured via simple dictionary constructions. This project seeks to improve the operational performance of a number of data management, cloud computing and communication systems by developing theoretical, algorithmic and software solutions for ordinal data compaction.

The main goal of the project is to develop the first general and comprehensive theoretical framework for ordinal compression. In particular, the investigators propose to investigate new distortion measures for ordinal data and rate-distortion functions for lossy ordinal compression; rank aggregation and learning methods for probabilistic ordinal models, used for ordinal clustering and quantization; and smooth compression and compressive computing in the ordinal domain. The proposed analytical framework will also allow for addressing algorithmic challenges arising in the context of compressing complete, partial and weak rankings. The accompanying software solutions are expected to find broad applications in areas as diverse as theoretical computer science (sorting, searching and selection), machine learning (clustering and learning to rank), and gene prioritization and phylogeny (reconstruction of lists of influential genes and ancestral genomes, respectively).

Project Start
Project End
Budget Start
2015-09-01
Budget End
2019-08-31
Support Year
Fiscal Year
2015
Total Cost
$250,000
Indirect Cost
Name
University of Illinois Urbana-Champaign
Department
Type
DUNS #
City
Champaign
State
IL
Country
United States
Zip Code
61820