The investigation and development of a new algorithmic framework for media representation are proposed, with emphasis on digital images and video. In particular, a new architecture for visual information representation is detailed, based on the notion of programmable terminals, which essentially treats the decoder as a Turing machine. In such a framework, the algorithm itself becomes the content, rather than just being used to process content. The benefits of such an approach are numerous, and can be appreciated by considering the revolutionary gains derived from the transition of computer usage from the days of simple text terminals for number-crunching mainframes to today's powerful graphical user interfaces that host Web browsers. Computers were initially considered calculation devices, but are now mostly used to generate elementary forms of algorithmic content, such as windows, menus, icons, and fonts, that empower users to perform advanced text document creation, distribution, and access. The proposed framework generalizes this paradigm to the case of digital images and video, and allows its use in a wide range of applications. The shift towards algorithmic representations requires both theoretical and applied innovations. From a theoretical standpoint, it is shown that a new mathematical theory, called Complexity Distortion Theory, can be developed, providing a universal mathematical framework in which all coding techniques (from traditional entropy coding to model-based video coding) can be appropriately analyzed. This theory builds on the notion of a programmable terminal and promises significant results for real systems that operate under time and space constraints, or when compression is only one of the representation requirements. From a practical, applications-oriented standpoint, this new approach allows the elimination of the decades-long lag of audio-visual information representation behind modern software design.
A software-based architecture is described, centered around a new programming language paradigm (FLAVOR - Formal Language for Audio-Visual Object Representation), and developed to provide the basis for real, functioning systems. Several applications and application domains are discussed in which the proposed approach can have a fundamental impact, empowering users with seamless creation, manipulation, and access to visual information.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 9703163
Program Officer: John Cozzens
Project Start:
Project End:
Budget Start: 1997-09-15
Budget End: 2001-08-31
Support Year:
Fiscal Year: 1997
Total Cost: $205,000
Indirect Cost:

Name: Columbia University
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10027