This research on gesture, speech, and gaze investigates automatic discourse segmentation of multimodal communication data. The goal is to discern discourse structure from analysis of the kind of video and audio one can reasonably expect from ordinary video recordings and their sound tracks. The research will address the interpretation of gesture, speech, and gaze in discourse management, using psycholinguistic models to explain how these modalities combine to express discourse structure. Specifically, it will develop algorithms for recognizing 'catchments': empirically grounded thematic segments identified by the partial recurrence of prosodic, gaze, and gesture features during natural discourse. These findings will be integrated into a hierarchical model that is both amenable to computational implementation and reflective of human communicative realities. The approach involves experiments designed to discover and quantify cues in the various modalities and their relation to discourse management; the development of computational algorithms to detect and recognize such cues; and the integration of these cues into a cogent discourse management system. The team, comprising researchers in psycholinguistics, machine vision, and signal processing, gains strength from its interdisciplinary scope. The technology developed will have significant impact on natural language understanding, human-computer interaction, and discourse and video databases.
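
To make the catchment idea concrete, the following is a minimal, hypothetical sketch of recurrence-based segmentation: time frames carrying discrete prosodic, gaze, and gesture feature labels are grouped while their features partially recur, and a new thematic segment opens when recurrence drops. The feature names, similarity measure, and threshold are illustrative assumptions, not the project's actual algorithm.

```python
# Illustrative sketch only: a toy segmenter in the spirit of catchment
# detection, grouping frames whose multimodal features partially recur.
# All feature labels and the threshold are hypothetical assumptions.

def jaccard(a: set, b: set) -> float:
    """Overlap between two feature sets (1.0 = identical, 0.0 = disjoint)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def segment(frames: list, threshold: float = 0.34) -> list:
    """Start a new segment whenever a frame's feature recurrence with the
    current segment's accumulated feature pool drops below the threshold."""
    segments, pool = [[frames[0]]], set(frames[0])
    for frame in frames[1:]:
        if jaccard(frame, pool) >= threshold:
            segments[-1].append(frame)   # features recur: same catchment
            pool |= frame
        else:
            segments.append([frame])     # recurrence breaks: new catchment
            pool = set(frame)
    return segments

# Toy per-frame observations: gesture hand shape, gaze target, prosodic cue.
frames = [
    {"rh_point", "gaze_listener", "pitch_high"},
    {"rh_point", "gaze_object", "pitch_high"},
    {"rh_point", "gaze_object", "pause"},
    {"two_hand_iconic", "gaze_away", "pitch_low"},  # features shift: new theme
    {"two_hand_iconic", "gaze_away", "pitch_low"},
]
for i, seg in enumerate(segment(frames)):
    print(f"segment {i}: {len(seg)} frames")  # -> 3 frames, then 2 frames
```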

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9618887
Program Officer: Ephraim P. Glinert
Budget Start: 1997-03-01
Budget End: 2000-07-31
Fiscal Year: 1996
Total Cost: $748,378
Name: University of Illinois at Chicago
City: Chicago
State: IL
Country: United States
Zip Code: 60612