The long-range goal of this project is to develop a model that simulates human perception of the words in spoken language. The immediate goal of the present project is to achieve one part of this task: the identification of the distinctive features of consonants in spoken syllables, words, and sentences. Key ideas in our approach are that the speech signal contains reliable acoustic cues to the articulation implemented by the speaker, even when surface phonetics are variable, and that the articulation is governed by abstract contrastive phonological representations. Identification of the consonant features proceeds by first performing an acoustic analysis of the speech to establish the locations of landmarks, or discontinuities in the sound, where consonant closures and releases are formed. The sound in the vicinity of those landmarks is then subjected to further detailed analysis to establish the underlying features of the consonant that generates each landmark, including the place of articulation, the voicing feature, and the nasal feature. This further analysis involves extracting from the sound a number of attributes that provide cues for each of the underlying consonant features. The selection of these attributes is guided by the requirement that they be closely related to the articulatory shapes and movements that produced the speech. In the proposed work, our current understanding of the combinations of attributes that most effectively reveal the articulation and its governing features and segments will be expanded and refined through detailed theory-driven acoustic measures, perceptual experimentation, and appropriate statistical analysis. The robustness of the model will be evaluated using various kinds of utterances, from citation forms to running speech. The performance of the model will also be tested on speech that has been contaminated with noise, and the errors made by the model will be compared with those made by human listeners.
The model has applications in the study of speech perception by listeners with impaired hearing and by listeners in environments in which speech is degraded. An understanding of these processes of speech perception can lead to improved approaches to the remediation of disorders of speech perception and production.
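
The first stage described in the abstract, locating landmarks at abrupt discontinuities in the sound, can be illustrated with a deliberately simplified sketch. This is not the project's detector: it assumes a single broadband energy contour and a fixed threshold, and the names `frame_energy_db`, `find_landmarks`, `synthetic_vcv`, and the 9 dB threshold are all hypothetical choices made for illustration only.

```python
import math


def frame_energy_db(samples, frame_len, hop):
    """Short-time energy in dB for each frame of a mono signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # A small floor avoids log(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def find_landmarks(energy_db, threshold_db=9.0):
    """Return (frame_index, kind) pairs where energy changes abruptly.

    kind is "closure" for a sharp fall and "release" for a sharp rise.
    The threshold is an illustrative assumption, not a published value.
    """
    landmarks = []
    for i in range(1, len(energy_db)):
        delta = energy_db[i] - energy_db[i - 1]
        if delta <= -threshold_db:
            landmarks.append((i, "closure"))
        elif delta >= threshold_db:
            landmarks.append((i, "release"))
    return landmarks


def synthetic_vcv(rate=8000):
    """Toy vowel-silence-vowel signal: 100 ms tone, 50 ms gap, 100 ms tone."""
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate)
            for n in range(rate // 10)]
    gap = [0.0] * (rate // 20)
    return tone + gap + tone


if __name__ == "__main__":
    # 10 ms frames with no overlap at an 8 kHz sampling rate.
    energy = frame_energy_db(synthetic_vcv(), frame_len=80, hop=80)
    print(find_landmarks(energy))
```

On the toy signal this reports one closure-like fall at the start of the silent gap and one release-like rise at its end. Real speech would call for energy tracked in several frequency bands, smoothing, and the further attribute extraction near each landmark that the abstract describes, none of which is attempted here.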

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
5R01DC002978-10
Application #
6894722
Study Section
Special Emphasis Panel (ZRG1-BBBP-3 (01))
Program Officer
Shekim, Lana O
Project Start
1996-05-01
Project End
2007-01-15
Budget Start
2005-06-01
Budget End
2007-01-15
Support Year
10
Fiscal Year
2005
Total Cost
$503,200
Indirect Cost
Name
Massachusetts Institute of Technology
Department
Internal Medicine/Medicine
Type
Schools of Arts and Sciences
DUNS #
001425594
City
Cambridge
State
MA
Country
United States
Zip Code
02139
Zhao, Sherry Y (2010) Stop-like modification of the dental fricative /ð/: an acoustic analysis. J Acoust Soc Am 128:2009-20
Bohm, Tamás; Shattuck-Hufnagel, Stefanie (2009) Do listeners store in memory a speaker's habitual utterance-final phonation type? Phonetica 66:150-68
Stevens, Kenneth N (2002) Toward a model for lexical access based on acoustic landmarks and distinctive features. J Acoust Soc Am 111:1872-91