When auditory speech stimuli are degraded by external factors such as noise or by internal factors such as hearing loss, being able to see the talker typically improves speech perception. This effect is usually explained as the result of auditory and visual speech information combining, so that more speech information is available to the perceiver. In this project, we will examine the novel hypothesis that the visual information can also guide perceptual learning of the information in the auditory speech stimulus. Auditory speech perception is altered as a result of experience or training with audiovisual (AV) speech stimuli. We hypothesize that the basis for the perceptual learning effect is the ability to exploit correlations or contingencies between auditory and visual speech cues in the input stimuli. These relationships exist because the biomechanics of speech produce both sights and sounds, and perceivers gain implicit knowledge of these audiovisual relationships in the speech they encounter in daily life. Experiments on auditory speech perceptual learning will be carried out with normal-hearing, sighted adults. The stimuli will derive from natural recordings, and the acoustic speech will be degraded by vocoding. The video will be either natural or synthetic. Training will use a paired-associates task in which participants will learn associations between each two-syllable nonsense word and its assigned nonsense picture. Learning will be measured by scores during training and by scores in an auditory-only test that follows paired-associates training. Generalization will be measured by consonant identification in new two-syllable nonsense words, before any training and at the conclusion of the experiment. In Exp. 1, the necessity for synchronized AV speech to achieve optimal auditory learning will be tested with synchronized and desynchronized speech, printed words, and auditory-only control conditions. In Exp. 2, the ability of visual speech to distort auditory speech perceptual representations will be tested by mismatching the auditory and visual stimuli during training. In Exp. 3, AV conditions with natural or synthesized video stimuli will be compared to test whether the quality of the visual speech information affects auditory perceptual learning. In Exp. 4, attention to auditory speech cues during AV training will be challenged by training with a different talker for each paired associate. Clinical relevance: Perceptual learning is critical to successful use of sensory prostheses, for example, cochlear implants and hearing aids. Perceptual learning under unisensory conditions can be limited by the necessity to access new stimulus information based only on the information delivered through the unisensory prosthesis. Multisensory stimuli with natural correlations or contingencies delivered to the senses in real time could guide unisensory perceptual learning more effectively and efficiently. Research is needed to understand how multisensory learning achieves these goals, and how to develop practical training regimes.

Public Health Relevance

Auditory perceptual learning is critical to successful use of sensory prostheses, for example, cochlear implants and hearing aids. Audiovisual training can promote more effective and more efficient auditory perceptual learning, provided the training conditions are appropriate. This project seeks to determine the audiovisual conditions needed to improve auditory speech perception.

National Institutes of Health (NIH)
National Institute on Deafness and Other Communication Disorders (NIDCD)
Exploratory/Developmental Grants (R21)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Platt, Christopher
George Washington University
Other Health Professions
Schools of Arts and Sciences
United States