A video sequence is a rich multimodal information source, including speech, text, non-speech audio, the color patterns and shapes of imaged objects (reflected in individual frames), and the motion of these objects (revealed by changes between frames). Although humans can quickly interpret the semantic content carried by these different modalities, computer understanding of video sequences is still at a primitive stage. The aim of this project is to develop new theory and techniques for scene segmentation and classification in a video sequence, which are key to video understanding. Research in this arena has in the past several years focused on the use of text, speech, and image information. The proposed research explores the use of motion and audio characteristics, which provide important complementary information. New results are anticipated both in the general theory of feature analysis and classification and in practical techniques for video understanding and scene classification. These developments will have direct applications in information indexing and retrieval in multimedia databases, spotting and tracking of special events in surveillance video, video editing, and movie stratification.
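As an illustration of the frame-change motion cue the abstract alludes to, the following minimal Python sketch flags candidate scene cuts by comparing color histograms of consecutive frames. It is not taken from the proposal; the histogram size, L1 distance measure, and threshold are illustrative assumptions.

import numpy as np

def color_histogram(frame: np.ndarray, bins: int = 16) -> np.ndarray:
    """Normalized per-channel intensity histogram of an H x W x 3 uint8 frame."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
             for c in range(frame.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def shot_boundaries(frames, threshold: float = 0.4):
    """Flag frame indices where the L1 histogram distance to the previous
    frame exceeds `threshold` -- a crude proxy for a scene cut."""
    boundaries = []
    prev = None
    for i, frame in enumerate(frames):
        h = color_histogram(frame)
        if prev is not None and np.abs(h - prev).sum() > threshold:
            boundaries.append(i)
        prev = h
    return boundaries

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic clip: 10 dark frames, then 10 bright frames (one "cut" at index 10).
    clip = [rng.integers(0, 64, (48, 64, 3), dtype=np.uint8) for _ in range(10)]
    clip += [rng.integers(192, 256, (48, 64, 3), dtype=np.uint8) for _ in range(10)]
    print(shot_boundaries(clip))  # -> [10]

Real systems of this kind typically combine such low-level frame differencing with the audio and motion features the proposal targets, rather than relying on a single threshold.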

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9619114
Program Officer: Ephraim P. Glinert
Budget Start: 1997-03-01
Budget End: 2001-02-28
Fiscal Year: 1996
Total Cost: $538,520
Institution Name: Polytechnic University of New York
City: Brooklyn
State: NY
Country: United States
Zip Code: 11201