Visual data, including video imagery, conveys "information" about objects of interest within the scene: shape, material, identity, relations, etc. However, it is also highly redundant and subject to variability that has little to do with the properties of the scene of interest but instead depends on the sensor, the vantage point, the nature of the illuminant, etc. This project addresses the question of determining what function of imaging data should be inferred and stored that is as "informative" as possible for a class of tasks such as object or scene detection, localization, recognition, and categorization, and at the same time as "compressed" as possible and insensitive to nuisance variability. Such a function is called a Representation. This research has pedagogical value: by framing seemingly unrelated methods as different approximations of an ideal Representation, it facilitates the educational process in Computer Vision. It is further expected to facilitate the design of better Representations, and therefore improved algorithms for visual recognition systems (detection, localization, recognition, and categorization), with impact in a range of applications from autonomy (e.g., robotic navigation and surveillance) to interaction (e.g., assisted surgery and augmented reality).

The project frames the problem of inferring optimal task-specific Representations in terms of the Information Bottleneck Principle, and addresses issues of computability, approximation, and dimensionality reduction within this framework. It also addresses questions of "learnability," to determine whether a generic learning architecture can approximate an optimal Representation. The Information Bottleneck is a generalization and relaxation of the notion of a minimal sufficient statistic, in which complexity constraints and task relevance are explicitly taken into account. The challenge is that modeling the generative process for visual data entails complex geometry (surface shape), topology (occlusions), photometry (material reflectance, illumination), and dynamics (motion), with the objects of interest living in infinite-dimensional spaces. Thus, the Information Bottleneck is difficult even to formalize, let alone instantiate, compute, and optimize. The project focuses on developing approximations of the Information Bottleneck that are tractable yet enjoy performance guarantees.

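As context for the preceding paragraph, the generic Information Bottleneck objective (stated here in its standard form, not as this project's specific instantiation) seeks a stochastic map from the data X to a representation T that retains as much information as possible about the task variable Y while compressing X as much as possible:

\[
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta\, I(T;Y), \qquad \text{subject to the Markov chain } Y \to X \to T,
\]

where I(\cdot;\cdot) denotes mutual information and \beta \ge 0 trades off compression (small I(X;T)) against task relevance (large I(T;Y)); as \beta \to \infty the solution approaches a sufficient statistic for Y, recovering the notion that the Bottleneck relaxes.
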
Project Start:
Project End:
Budget Start: 2014-07-15
Budget End: 2018-06-30
Support Year:
Fiscal Year: 2014
Total Cost: $456,617
Indirect Cost:
Name: University of California Los Angeles
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095