Links Between Production and Perception in Speech

Whalen, Doug

Abstract

Coordinated actions of the many components of the vocal tract during speech produce a complex acoustic signal. Because the main signal is sound, it is often assumed that we can find the important aspects of that sound without regard to the speech gestures that gave rise to it. Some acoustic signatures, however, were discovered (in the previous cycle of this grant) only by making predictions from the way the tongue moves. Although previous researchers had claimed that perceptual recovery of such gestural information is impossible, it turns out that, for the regions useful for speech, there are enough constraints to make the computation solvable. The current research will extend those results from vowels to the more critical consonants, and will show that listeners make use of the signatures of the articulation. Two main classes of theories, gestural and acoustic, differ in their treatment of how this acoustic evidence is learned. Acoustic theories attribute it to learning during babbling, while gestural theories assert that the constraints of the vocal tract are sufficient. The gestural hypothesis that listeners make use of all aspects of a gesture predicts that even unfamiliar information will be used, while the acoustic theory leads us to expect that prior experience is needed. We have found that unusual gestural correlates, such as a puff of air, are used perceptually as well, despite not being learned.
A second aim of the research is to expand those findings to even more unusual sources of information (e.g., visual evidence of a candle flickering near the speaker's mouth). These air puffs, called aspiration, are not used by all languages, however, and we will test whether active, linguistic use of aspiration is necessary for using these gestural cues. These results will shape our understanding of the fundamental organization of speech and its learning. Learning a second language, whether it is English or one of the world's many other languages, is often hampered by difficulty with the new sounds the other language uses. A third aim of this project is to apply the results of the basic studies addressing its first two aims to exploring new ways of training language learners to produce novel sounds. To the extent that speech perception is tightly linked to production, providing feedback on production of the sounds that are imperfectly learned should increase success. Here, the feedback will be provided by ultrasound images of the tongue during difficult sounds. An example for those learning English is the mastery of the /l/ and /r/ sounds. For English speakers learning another language, an example is the trilled /r/ of Spanish. The studies proposed here are expected to provide new ways of improving second language learning.

Public Health Relevance

The project addresses ways in which the acoustic speech signal can be used by listeners to extract the underlying linguistically significant movements of the vocal tract. The research will show which acoustic information is important, that perceivers also use non-acoustic information, and that use of speech production feedback improves second language learning.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC002717-16
Application #: 8704366
Study Section: Language and Communication Study Section (LCOM)
Program Officer: Shekim, Lana O

Project Start: 1996-05-01
Project End: 2017-07-31
Budget Start: 2014-08-01
Budget End: 2015-07-31
Support Year: 16
Fiscal Year: 2014

Institution

Name: Haskins Laboratories, Inc.

City: New Haven
State: CT
Country: United States
Zip Code: 06511

Publications

Derrick, Donald; Carignan, Christopher; Chen, Wei-Rong et al. (2018) Three-dimensional printable ultrasound transducer stabilization system. J Acoust Soc Am 144:EL392

Whalen, D H; Chen, Wei-Rong; Tiede, Mark K et al. (2018) Variability of articulator positions and formants across nine English vowels. J Phon 68:1-14

Krivokapić, Jelena; Tiede, Mark K; Tyrone, Martha E (2017) A Kinematic Study of Prosodic Structure in Articulatory and Manual Gestures: Results from a Novel Method of Data Collection. Lab Phonol 8:

Abramson, Arthur S; Whalen, D H (2017) Voice Onset Time (VOT) at 50: Theoretical and practical issues in measuring voicing distinctions. J Phon 63:75-86

Roon, Kevin D; Gafos, Adamantios I (2016) Perceiving while producing: Modeling the dynamics of phonological planning. J Mem Lang 89:222-243

Shadle, Christine H; Nam, Hosung; Whalen, D H (2016) Comparing measurement errors for formants in synthetic and natural vowels. J Acoust Soc Am 139:713-27

Dawson, Katherine M; Tiede, Mark K; Whalen, D H (2016) Methods for quantifying tongue shape and complexity using ultrasound imaging. Clin Linguist Phon 30:328-44

Whalen, D H (2016) Direct Perceptions of Carol Fowler's Theoretical Perspective. Ecol Psychol 28:183-187

Bicevskis, Katie; Derrick, Donald; Gick, Bryan (2016) Visual-tactile integration in speech perception: Evidence for modality neutral speech primitives. J Acoust Soc Am 140:3531

Jackson, Eric S; Tiede, Mark; Riley, Michael A et al. (2016) Recurrence Quantification Analysis of Sentence-Level Speech Kinematics. J Speech Lang Hear Res 59:1315-1326

Showing the most recent 10 out of 72 publications
