This project will investigate listeners' ability to understand speech produced by different talkers. This perceptual ability, known as speaker normalization, is one example of the adaptiveness of speech perception; a better understanding of how listeners adapt to different talkers may therefore lead to a better understanding of the adaptive perceptual processes that take place in response to hearing impairment. Despite its intrinsic and practical importance, speaker normalization has not been a focus of interest for most speech perception researchers. The project will compare two models of speaker normalization. The mediated normalization model holds that the perception of speech is cognitively mediated by information about the speaker. The immediate normalization model holds that information in the speech signal arising from acoustic differences between speakers is auditorily integrated with linguistic acoustic cues in such a way that differences between speakers are eliminated. These two models will be contrasted in 13 experiments testing (1) the role of familiarity with particular speakers in speech perception, (2) the role of secondary acoustic cues to speaker identity (such as breathiness) and of visual information about the speaker, and (3) speaker normalization effects for plosive and nasal consonants. The proposed experiments use synthetic speech or digitally processed natural speech and manipulate acoustic cues to speaker identity, interstimulus interval, and presentation type (speakers randomly mixed versus stimuli blocked by speaker), while observing the impact of these manipulations on the perception of vowels, fricatives, stops, and nasals. In addition to the experiments, the proposed program of research involves the construction of mathematical models of speech perception.
Exemplar-based implementations of both the mediated and immediate normalization models will be constructed and tested against the data generated by the experiments.
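To make the exemplar-based approach concrete, the following is a minimal sketch of a generic exemplar (GCM-style) categorizer, not the project's actual model: stored exemplars pair an acoustic vector with a category label, and a new token is assigned to the category whose exemplars are jointly most similar to it. All values and category labels here are illustrative assumptions.

```python
import math

def similarity(x, exemplar, c=1.0):
    """Exponential similarity to one stored exemplar (Shepard-style decay
    with Euclidean distance; c is an assumed sensitivity parameter)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, exemplar)))
    return math.exp(-c * dist)

def classify(x, memory):
    """Return the category whose exemplars are jointly most similar to x."""
    scores = {}
    for vector, label in memory:
        scores[label] = scores.get(label, 0.0) + similarity(x, vector)
    return max(scores, key=scores.get)

# Toy exemplar memory: hypothetical scaled F1/F2 values for two vowel
# categories, with one exemplar per category from each of two talkers.
memory = [
    ((2.7, 23.0), "i"), ((3.0, 25.0), "i"),   # talker A and talker B /i/
    ((7.3, 12.0), "a"), ((7.8, 13.0), "a"),   # talker A and talker B /a/
]

print(classify((2.9, 24.0), memory))
```

Under a mediated-normalization variant, the memory could be partitioned by talker and weighted by inferred speaker identity before classification; under an immediate-normalization variant, the acoustic vectors would first be rescaled so that talker differences cancel. Both variants share this same similarity-based core.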