Our long-term goal is to understand how humans organize their brains and vocal tracts so that they can speak; only by understanding normal function can we see what goes wrong in disorders. Although it is uncontroversial that most of the speech we hear is produced by a human vocal tract, it is less widely accepted that speech production and speech perception are intricately linked. Many theorists hold that the vocal tract's acoustic output is processed in a purely acoustic manner, and that any link to production would appear only as modifications of vocal tract shape made to achieve particular acoustics. An alternative approach holds that speech consists of gestures: coordinated activities of articulators, such as the jaw and the lips, that achieve a phonetic goal, such as lip closure. The gestural model has allowed an insightful interpretation of many speech production phenomena, and it has begun to yield testable predictions for perceptual theories as well. The proposed experiments expand on this research, showing how perception of gestures is possible in automatic speech recognition, how the consequences of articulation (acoustic, visual, and even haptic) are used by perceivers, and how accommodations are made for differences between speakers. This theoretical outlook has been fruitfully applied to problems in language acquisition, language change, and certain language disabilities; the advances from the proposed research should allow even broader applications.

The goal is to show how acoustic parameters that cohere because of their common origin in articulation are used by listeners. This will be accomplished through acoustic modeling of natural productions; perception of natural speech under modified conditions (e.g., degraded by noise or enhanced by feeling the articulators as they say what is being heard); and measurement of speech with ultrasound and optical markers. These measurements provide input to our configurable articulatory synthesizer, which can be matched to the size and acoustic output of individual speakers. Stimuli generated from this synthesizer can test hypotheses about what is important in the production patterns we observe. The results of these experiments will show more clearly than ever the tight link between the production and perception of speech.

Relevance: Speech is the primary means most humans use to communicate and maintain social relationships, but it is vulnerable to a range of disorders. We must understand how speech works normally in order to know what to do when things go wrong. Research along the lines of the present project has already contributed to other grants addressing such disorders as Parkinson's disease and autism.

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC002717-11
Application #: 7387332
Study Section: Language and Communication Study Section (LCOM)
Program Officer: Shekim, Lana O
Project Start: 1996-05-01
Project End: 2011-03-31
Budget Start: 2008-04-01
Budget End: 2009-03-31
Support Year: 11
Fiscal Year: 2008
Total Cost: $626,960
Indirect Cost:
Name: Haskins Laboratories, Inc.
Department:
Type:
DUNS #: 060010147
City: New Haven
State: CT
Country: United States
Zip Code: 06511
Derrick, Donald; Carignan, Christopher; Chen, Wei-Rong et al. (2018) Three-dimensional printable ultrasound transducer stabilization system. J Acoust Soc Am 144:EL392
Whalen, D H; Chen, Wei-Rong; Tiede, Mark K et al. (2018) Variability of articulator positions and formants across nine English vowels. J Phon 68:1-14
Krivokapić, Jelena; Tiede, Mark K; Tyrone, Martha E (2017) A Kinematic Study of Prosodic Structure in Articulatory and Manual Gestures: Results from a Novel Method of Data Collection. Lab Phonol 8:
Abramson, Arthur S; Whalen, D H (2017) Voice Onset Time (VOT) at 50: Theoretical and practical issues in measuring voicing distinctions. J Phon 63:75-86
Shadle, Christine H; Nam, Hosung; Whalen, D H (2016) Comparing measurement errors for formants in synthetic and natural vowels. J Acoust Soc Am 139:713-27
Dawson, Katherine M; Tiede, Mark K; Whalen, D H (2016) Methods for quantifying tongue shape and complexity using ultrasound imaging. Clin Linguist Phon 30:328-44
Whalen, D H (2016) Direct Perceptions of Carol Fowler's Theoretical Perspective. Ecol Psychol 28:183-187
Bicevskis, Katie; Derrick, Donald; Gick, Bryan (2016) Visual-tactile integration in speech perception: Evidence for modality neutral speech primitives. J Acoust Soc Am 140:3531
Jackson, Eric S; Tiede, Mark; Riley, Michael A et al. (2016) Recurrence Quantification Analysis of Sentence-Level Speech Kinematics. J Speech Lang Hear Res 59:1315-1326
Bicevskis, Katie; de Vries, Jonathan; Green, Laurie et al. (2016) Effects of mouthing and interlocutor presence on movements of visible vs. non-visible articulators. Can Acoust 44:17-24

Showing the most recent 10 out of 72 publications