Knowledge-Based Document Image Understanding

Srihari, Sargur; Gaborski, Roger

Abstract

Understanding printed documents is an intelligent activity. This research is about automating one aspect of analyzing a document image and deriving a high-level representation of its visual content. Documents contain photographs and accompanying text. This effort is concerned with arriving at an integrated interpretation of the communicative unit consisting of photographs and their captions. When text describes salient aspects of a photograph, it is possible to use the text to direct a vision system in understanding the photograph. There are two components to this research: the first deals with language issues and the second with the development of a vision subsystem. Methods of extracting visual information from text, specifically the cues required to identify salient objects, are to be studied; such information may be present in a variety of forms, based on both syntax and semantics. The role of textually extracted visual cues in performing visual object recognition is also to be studied. As a test of the theory, it is proposed to develop a system in which the result of parsing a caption of a newspaper photograph is used to identify human faces in the photograph. The face location subsystem will incorporate scale-invariant techniques and filters that characterize faces based on the presence of distinguishing visual features.
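The idea of using a parsed caption to direct face identification can be illustrated with a toy sketch. This is not the project's method: it assumes a hypothetical caption pattern ("From left: ..." or "Left to right: ...") and assumes that a face locator has already produced candidate boxes as (x, y, width, height) tuples. The function names `extract_cues` and `label_faces` are invented for illustration.

```python
import re

# Toy caption pattern, assumed for illustration only:
#   "From left: A B, C D and E F."  or  "Left to right: A B, C D, and E F."
_CAPTION = re.compile(
    r"\s*(?:from\s+)?(left|right)(?:\s+to\s+(?:left|right))?\s*[:,]\s*(.+?)\.?\s*$",
    re.IGNORECASE,
)


def extract_cues(caption):
    """Return (starting_side, names) extracted from a caption.

    starting_side is "left" or "right"; names are in caption order.
    Returns (None, []) when the caption does not fit the toy pattern.
    """
    m = _CAPTION.match(caption)
    if m is None:
        return None, []
    side = m.group(1).lower()
    # Split the name list on commas and on the word "and".
    parts = re.split(r",\s*(?:and\s+)?|\s+and\s+", m.group(2))
    names = [p.strip() for p in parts if p.strip()]
    return side, names


def label_faces(caption, face_boxes):
    """Pair each name in the caption with a face box using spatial order.

    face_boxes: list of (x, y, w, h) tuples from a hypothetical face locator.
    Returns {} when the cues and the detections cannot be reconciled.
    """
    side, names = extract_cues(caption)
    if side is None or len(names) != len(face_boxes):
        return {}
    # Order boxes by horizontal position, starting from the side the caption names.
    ordered = sorted(face_boxes, key=lambda box: box[0], reverse=(side == "right"))
    return dict(zip(names, ordered))


if __name__ == "__main__":
    boxes = [(200, 40, 30, 30), (20, 50, 30, 30), (110, 45, 30, 30)]
    print(label_faces("From left: Alice Smith, Bob Jones and Carol Lee.", boxes))
```

Here the caption supplies two cues, the number of people and their left-to-right order, and those cues constrain which detected face receives which name. A real system would need far richer syntactic and semantic analysis than this pattern match.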

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9014110
Program Officer: Howard Moraff

Budget Start: 1991-06-01
Budget End: 1993-05-31
Fiscal Year: 1990
Total Cost: $188,888

Institution: SUNY at Buffalo, Buffalo, NY, United States