This is a three-year standard award. When people talk, they produce both acoustic and optical speech signals, and people with normal hearing and vision make use of both types of signals when communicating face to face. Being able to see as well as hear a talker can enhance speech understanding when speech is noisy or otherwise difficult to comprehend. The goals of this multidisciplinary project are to quantitatively characterize optical speech signals, to examine how optical speech characteristics relate to acoustic and physiologic speech characteristics, to study several fundamental issues in human visual speech perception, and to apply the resulting knowledge to optical speech synthesis. Across the entire project, the main questions we address are: (1) What speech information can perceivers get from seeing talkers? (2) How are optical and acoustic signals related to underlying speech articulations? (3) What are the perceptual and neurophysiological bases for visual speech perception? and (4) Can we demonstrate the usefulness of this knowledge for developing synthesis of artificial talking faces?

A multi-talker database is being recorded for this project. Recordings include acoustic, optical (with retroreflective markers on the talkers' faces), and physiologic signals (Electromagnetic Midsagittal Articulography, EMA). Studies follow up on recent results in the literature showing high correlations between acoustic and optical speech measures, and between external (optical) and internal (physiologic) speech measures (illustrated in the sketch below). They include perceptual experiments to determine segmental and prosodic speech characteristics. We are investigating the neurophysiological bases of visual speech perception in deaf and hearing adults using electrophysiological measures. Optical speech synthesis is being employed to (1) test our understanding of the cues that control visual perception of phonemes and prosody, and (2) investigate the neurophysiological bases for human sensitivity to optical speech characteristics.
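To make the correlation analysis concrete, the following is a minimal illustrative sketch, not the project's actual pipeline: it assumes two time-aligned, frame-level measurement tracks (a hypothetical acoustic RMS energy track and a hypothetical optical lip-opening track, synthesized here so the example runs standalone) and computes their Pearson correlation.

import numpy as np

rng = np.random.default_rng(0)
n_frames = 500  # hypothetical number of synchronized analysis frames

# Synthetic stand-ins for time-aligned measurement tracks; in practice
# these would come from the recorded optical markers and the acoustics.
lip_opening = np.abs(np.sin(np.linspace(0, 10 * np.pi, n_frames)))  # optical measure
rms_energy = 0.8 * lip_opening + 0.2 * rng.normal(size=n_frames)    # acoustic measure

# Pearson correlation coefficient between the two tracks.
r = np.corrcoef(rms_energy, lip_opening)[0, 1]
print(f"acoustic-optical correlation r = {r:.2f}")

A high value of r for such paired tracks is the kind of relationship the cited literature reports between acoustic, optical, and physiologic speech measures.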

The project will impact engineering in the areas of speech synthesis and audiovisual automatic speech recognition. It will extend understanding of human speech perception and its neurophysiological bases in deaf and hearing individuals. Applications expected to derive from the project include second-language training, enhancement of speech transmission quality and recognition accuracy under environmental noise, efficient storage and transmission of optical speech information, stimulus control in audiovisual perceptual experiments, and communication enhancement for hearing-impaired people. The multidisciplinary team of principal investigators represents the fields of cognitive science, speech perception, linguistics, electrical engineering, and neurophysiology.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 9872849
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 1998-10-01
Budget End: 1998-12-31
Support Year:
Fiscal Year: 1998
Total Cost: $1,400,000
Indirect Cost:
Name: University of California Los Angeles
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095