Watching a speaker's face and lips provides powerful information for speech perception and language understanding. Visible speech is particularly effective when the auditory speech is degraded because of noise, bandwidth filtering, or hearing impairment. The proposed research involves three main areas of inquiry into the use of visible information in speech perception.

The first area involves research and development of computer-animated facial displays. Synthetic visible speech has great potential for advancing our knowledge of the visible information in speech perception, how it is utilized by human perceivers, and how it is combined with auditory speech. A better model of speech articulation is needed, however, one incorporating physical measurements from real speech and rules describing coarticulation between segments. Further work is proposed to increase the available information and to improve the realism of the face. Standard tests of intelligibility will be used to assess the quality of the facial synthesis.

The second area of inquiry is the measurement of facial and tongue movements during speech production, and the analysis of the features used by human observers in visual-auditory speech perception. Systematic measurements of visible speech will be made using a computer-controlled video motion analyzer. These measurements will be used to control synthetic visual speech and will also be correlated with perceptual measures to identify which physical characteristics are actually used by human observers.

The third area evaluates the contribution of facial information in general, and of various visual features in particular, to speech perception. Experimental studies with human observers will be carried out to assess the quality of the synthetic facial display and to better understand speech perception by eye and ear. Synthetic visible speech will allow the visual signal to be manipulated directly, an experimental capability central to the study of psychophysics and perception.
Although these three areas of inquiry address different problem domains in cognitive science and engineering, their simultaneous study affords potential developments not feasible in separate investigations. The general hypotheses examined in this research are that 1) animated visual speech from synthetic talkers is a valuable communication medium, 2) research with this medium will contribute to our understanding of speech perception by ear and by eye, and 3) the research will have valuable applications for improving communication for deaf and hearing-impaired individuals, people in noisy environments, people in difficult language situations such as second-language learning, and human-machine interaction.