Toward standardizing perceptual voice quality measures

Kreiman, Jody

Abstract

Perception of pathological voice quality is centrally important in clinical voice evaluation, but adequately quantifying the sound of a person's voice remains problematic. Data from studies completed during the previous funding period indicate that many difficulties associated with current measures of voice quality derive from the way in which quality is defined and measured. We propose the development of a psychoacoustic model of overall voice quality as an alternative to traditional ratings and acoustic analysis protocols. This psychoacoustic model will specify a set of perceptually-important acoustic parameters that combine to replicate and thereby quantify the overall, integral quality of a voice. We will first determine the minimal set of acoustic parameters required to produce a synthetic copy of any voice, such that listeners judge that the synthetic copy matches the quality of the original voice. This set will constitute a preliminary psychoacoustic model of voice quality. We will then refine and validate this psychoacoustic model by synthesizing copies of natural voices using only these model parameters. To the extent that listeners judge that the natural and synthetic tokens match exactly, the psychoacoustic model will be considered valid. Mismatches will be analyzed to determine what parameters should be added to or subtracted from the model. We will assess the relationship between changes in acoustic values and changes in the extent to which a voice deviates from normal. This will provide an explanatory model specifying how acoustic parameters combine and interact perceptually to determine the location of any voice sample along a continuum from "better" to "worse." Finally, we will investigate the link between perceptually-important acoustic, spectral changes and the associated alterations in glottal configuration.
Such knowledge could identify targets for remediation that have the highest likelihood of producing vocal improvement during treatment.

Public Health Relevance

Measurement of voice quality is a prime concern in management of patients with voice disorders, but problems of reliability and validity persist for existing procedures for gathering such measures. The proposed research will establish a valid, reliable, theoretically-motivated alternative to current systems for measuring voice quality. By using confirmatory methods to establish causal links between acoustic variables and perceived voice quality, the proposed studies will enhance our understanding of the relationship between a voice signal and the perceptual response it evokes, leading to a standardized, perceptually-validated, objective protocol for clinical use. Further, these measures will enable us to generate and test preliminary hypotheses regarding the changes in glottal configuration that cause perceptually-important changes in vocal acoustics, thus providing some of the first experimental evidence linking perception to production. By establishing links among physiology, acoustics, and perception, this research may significantly advance clinical practice.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC001797-20
Application #: 8335191
Study Section: Special Emphasis Panel (ZRG1-BBBP-J (02))
Program Officer: Shekim, Lana O

Project Start: 1992-12-01
Project End: 2016-08-31
Budget Start: 2012-09-01
Budget End: 2013-08-31
Support Year: 20
Fiscal Year: 2012
Total Cost: $540,712
Indirect Cost: $189,600

Institution

Name: University of California Los Angeles
Department: Surgery
Type: Schools of Medicine
DUNS #: 092530369

City: Los Angeles
State: CA
Country: United States
Zip Code: 90095

Publications

Zhang, Zhaoyan (2018) Vocal instabilities in a three-dimensional body-cover phonation model. J Acoust Soc Am 144:1216

Park, Soo Jin; Yeung, Gary; Vesselinova, Neda et al. (2018) Towards understanding speaker discrimination abilities in humans and machines for text-independent short utterances of different speech styles. J Acoust Soc Am 144:375

Wu, Liang; Zhang, Zhaoyan (2017) A Computational Study of Vocal Fold Dehydration During Phonation. IEEE Trans Biomed Eng 64:2938-2948

Zhang, Zhaoyan (2017) Effect of vocal fold stiffness on voice production in a three-dimensional body-cover phonation model. J Acoust Soc Am 142:2311

Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc (2016) Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech. J Speech Lang Hear Res 59:994-1001

Signorello, Rosario; Zhang, Zhaoyan; Gerratt, Bruce et al. (2016) Impact of Vocal Tract Resonance on the Perception of Voice Quality Changes Caused by Varying Vocal Fold Stiffness. Acta Acust United Acust 102:209-213

Kreiman, Jody (2016) On Peer Review. J Speech Lang Hear Res 59:480-3

Garellek, Marc; Samlan, Robin; Gerratt, Bruce R et al. (2016) Modeling the voice source in terms of spectral slopes. J Acoust Soc Am 139:1404-10

Titze, Ingo R; Baken, Ronald J; Bozeman, Kenneth W et al. (2015) Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization. J Acoust Soc Am 137:3005-7

Kreiman, Jody; Garellek, Marc; Chen, Gang et al. (2015) Perceptual evaluation of voice source models. J Acoust Soc Am 138:1-10

Showing the most recent 10 out of 35 publications
