The goal of the proposed research is to extend our knowledge of the early stages of word recognition in which listeners extract individual segments from the speech signal. Long-standing accounts of speech perception, which emphasized the abstract nature of linguistic representations, have recently been challenged by findings that indicate that talker-specific, acoustic-phonetic information is retained in memory and can facilitate word recognition. These findings raise the possibility that detailed acoustic-phonetic information is used to customize the mapping between signal and segmental representation on a talker-specific basis. In support of this alternative account, there is now evidence that listeners can track acoustic-phonetic properties for a particular talker. One such property is voice-onset-time (VOT), a temporal property of speech that marks the voicing contrast in stop consonants. Listeners can learn a talker's characteristic VOTs in the context of one word-initial voiceless stop and, moreover, can transfer this information to a novel word that begins with the same stop. A fundamental question that remains unanswered concerns the level of representation at which talker-specific, acoustic-phonetic information is tracked.
The specific aim of the proposed research is to address this question by determining whether listeners track talker-specific VOT with respect to a phonetic feature or with respect to a given phonetic segment. During a training phase, listeners will learn how two talkers produce /p/ or /k/. Speech synthesis techniques will be used to manipulate the VOTs of the two talkers so that one talker has shorter VOTs and the other talker has longer VOTs. During a test phase, a two-alternative forced-choice task will be used to examine transfer to words that begin with the same voiceless stop as used during training and to words that begin with a voiceless stop at a different place of articulation. If listeners track talker-specific VOT with respect to a phonetic feature, then information learned about voiceless stop consonants in the context of /p/ should transfer to /k/ and information learned in the context of /k/ should transfer to /p/. However, if listeners track talker-specific VOT with respect to a given phonetic segment, then transfer should be limited to words that begin with the same voiceless stop as used during training. This research will contribute to the theoretical understanding of talker specificity in speech perception as well as support the advancement of devices that recognize normal and disordered speech. One current limitation of such devices is their failure to adapt rapidly to talker differences in speech production. Examining how humans process this type of phonetic variation will provide critical information to incorporate into machine recognition of spoken language.
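The contrasting predictions of the two accounts described above can be laid out as a small truth table. The following is only an illustrative sketch: the function name `predicts_transfer` and the account labels `"feature"` and `"segment"` are hypothetical conveniences, not part of the proposed method, and the sketch encodes only the qualitative predictions stated in the abstract (whether transfer is expected), not any measured effect.

```python
# Hypothetical sketch of the two accounts' predictions; names are illustrative only.
VOICELESS_STOPS = ("p", "k")  # the two training segments named in the abstract


def predicts_transfer(account: str, trained: str, tested: str) -> bool:
    """Return True if the given account predicts that talker-specific VOT
    learned on `trained` transfers to words beginning with `tested`."""
    if trained not in VOICELESS_STOPS or tested not in VOICELESS_STOPS:
        raise ValueError("segments must be 'p' or 'k'")
    if account == "feature":
        # Tracking at the level of the voicing feature: transfer across place of articulation.
        return True
    if account == "segment":
        # Tracking at the level of the segment: transfer only to the trained stop.
        return trained == tested
    raise ValueError("account must be 'feature' or 'segment'")


if __name__ == "__main__":
    for account in ("feature", "segment"):
        for trained in VOICELESS_STOPS:
            for tested in VOICELESS_STOPS:
                outcome = predicts_transfer(account, trained, tested)
                print(f"{account:8s} trained=/{trained}/ tested=/{tested}/ transfer={outcome}")
```

The two accounts agree when the test word begins with the trained stop and diverge only when place of articulation differs, which is why the test phase includes both kinds of words.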
Theodore, Rachel M; Miller, Joanne L (2010) Characteristics of listener sensitivity to talker-specific phonetic detail. J Acoust Soc Am 128:2090-9
Theodore, Rachel M; Miller, Joanne L; DeSteno, David (2009) Individual talker differences in voice-onset-time: contextual influences. J Acoust Soc Am 125:3974-82