The long-range goal of this project is to develop a model that simulates human perception of the words in spoken language. The immediate goal of the present project is to achieve one part of this task: identifying the distinctive features of consonants in spoken syllables, words, and sentences. Two key ideas underlie our approach: that the speech signal contains reliable acoustic cues to the articulation implemented by the speaker, even when surface phonetics are variable, and that the articulation is governed by abstract contrastive phonological representations. Identification of the consonant features begins with an acoustic analysis of the speech to locate landmarks, that is, discontinuities in the sound where consonant closures and releases are formed. The sound in the vicinity of each landmark is then analyzed in further detail to establish the underlying features of the consonant that generated it, including the place of articulation, the voicing feature, and the nasal feature. This further analysis involves extracting from the sound a number of attributes that provide cues for each of the underlying consonant features. The selection of these attributes is guided by the requirement that they be closely related to the articulatory shapes and movements that produced the speech. In the proposed work, our current understanding of the combination of attributes that most effectively reveals the articulation and its governing features and segments will be expanded and refined through detailed theory-driven acoustic measures, perceptual experimentation, and appropriate statistical analysis. The robustness of the model will be evaluated on several kinds of utterances, from citation forms to running speech. The performance of the model will also be tested on speech that has been contaminated with noise, and the errors made by the model will be compared with those made by human listeners.
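The first stage described above, locating landmarks at abrupt discontinuities in the sound, can be illustrated with a minimal sketch. It is not the project's detector: it assumes a single broadband energy contour, a hypothetical 9 dB change threshold, and hypothetical function names (`frame_energy_db`, `find_landmarks`), and it runs on a synthetic signal rather than speech. A rise in energy is treated only as a candidate release and a fall as a candidate closure.

```python
import math


def frame_energy_db(samples, frame_len, hop):
    """Short-time energy in dB for each analysis frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))  # offset avoids log(0)
    return energies


def strongest(run):
    """Pick the frame with the largest energy change in a run and label it."""
    frame, delta = max(run, key=lambda item: abs(item[1]))
    return (frame, "abrupt_rise" if delta > 0 else "abrupt_fall")


def find_landmarks(energies_db, threshold_db=9.0):
    """Return (frame_index, kind) for each abrupt change in the energy contour.

    threshold_db is an illustrative assumption, not a value from the project.
    """
    landmarks = []
    run = []  # consecutive frames with a strong change of the same sign
    for i in range(1, len(energies_db) - 1):
        delta = energies_db[i + 1] - energies_db[i - 1]
        strong = abs(delta) >= threshold_db
        if strong and run and (delta > 0) == (run[-1][1] > 0):
            run.append((i, delta))
            continue
        if run:
            landmarks.append(strongest(run))
            run = []
        if strong:
            run = [(i, delta)]
    if run:
        landmarks.append(strongest(run))
    return landmarks


def square_wave(n, amplitude):
    """Synthetic stand-in for audio: alternating +/- amplitude samples."""
    return [amplitude if k % 2 == 0 else -amplitude for k in range(n)]


# Quiet, loud, quiet: a crude stand-in for closure, vowel, closure.
signal = square_wave(400, 0.001) + square_wave(400, 0.5) + square_wave(400, 0.001)
landmarks = find_landmarks(frame_energy_db(signal, frame_len=80, hop=40))
print(landmarks)
```

On this synthetic signal the sketch reports one rise at frame 9 and one fall at frame 19, the frames centered on the two amplitude changes. A detector for real speech would need band-specific energy measures and further checks before any feature analysis near each landmark.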
The model has applications in the study of speech perception by listeners with impaired hearing and by listeners in environments where speech is degraded. A better understanding of these processes of speech perception can lead to improved approaches to the remediation of disorders of speech perception and production.
Zhao, Sherry Y (2010) Stop-like modification of the dental fricative /ð/: an acoustic analysis. J Acoust Soc Am 128:2009-20
Bohm, Tamás; Shattuck-Hufnagel, Stefanie (2009) Do listeners store in memory a speaker's habitual utterance-final phonation type? Phonetica 66:150-68
Stevens, Kenneth N (2002) Toward a model for lexical access based on acoustic landmarks and distinctive features. J Acoust Soc Am 111:1872-91