Judgments of voice quality contribute greatly to patients' opinions of their voice, and to a clinician's decision to initiate and continue treatment. Quality judgments are also used as a standard against which instrumental measures of voice are validated. Nevertheless, these "subjective" measures of voice quality are not highly regarded as either clinical or research tools, because of problems with reliability and untested validity. We hypothesize that problems in voice quality measurement do not originate within the listener, but rather derive from the methods used to measure what listeners hear. The proposed research departs from traditional rating scale methods by applying a speech synthesizer for pathological voice quality to study basic issues concerning reliability and validity of voice quality measures, and to determine the perceptual importance of the acoustic elements that underlie voice quality perception. In this approach, listeners adjust synthesizer parameters to achieve a perceptual match to the original voice. This method-of-adjustment technique explicitly links the acoustic signal to voice quality. We hypothesize that this linkage across the speech chain increases the validity, reliability, and utility of the resulting perceptual responses, by providing listeners with an objective tool (a synthesizer) for quantifying what they hear. The proposed research focuses on four specific aims. First, we will expand, refine, and automate our synthesizer, to increase power, flexibility, and ease of use. Secondly, we will apply the synthesizer to examine the perceptual importance of several key aspects of pathologic voices, including the shape of the source spectrum, period doubling, and the perceptual interactions of harmonic and inharmonic (noise) energy in the voice source.
In each case, we will determine which acoustic features are perceptually important and whether these features interact with others, and we will estimate just-noticeable differences for each parameter as an index of the amount of acoustic change that is clinically meaningful on that dimension. Thirdly, we will evaluate clinicians' ability to apply the synthesizer as a tool for measuring voice quality in the clinic. We will also revisit the use of synthetic anchors in rating protocols as an alternative to the method-of-adjustment task. Finally, we will evaluate the external validity of the synthesis technique by comparing perceptual measures derived from sustained vowels to those derived from continuous speech. By increasing the power and ease of the synthesizer's use, examining listener sensitivity to important acoustic variables, and studying the generalizability of this approach, the proposed research will bring our goal of a standardized voice quality evaluation protocol within reach.

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC001797-16
Application #: 7320267
Study Section: Motor Function, Speech and Rehabilitation Study Section (MFSR)
Program Officer: Shekim, Lana O
Project Start: 1992-12-01
Project End: 2010-11-30
Budget Start: 2007-12-01
Budget End: 2008-11-30
Support Year: 16
Fiscal Year: 2008
Total Cost: $495,932
Indirect Cost:

Name: University of California Los Angeles
Department: Surgery
Type: Schools of Medicine
DUNS #: 092530369
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095

Zhang, Zhaoyan (2018) Vocal instabilities in a three-dimensional body-cover phonation model. J Acoust Soc Am 144:1216
Park, Soo Jin; Yeung, Gary; Vesselinova, Neda et al. (2018) Towards understanding speaker discrimination abilities in humans and machines for text-independent short utterances of different speech styles. J Acoust Soc Am 144:375
Wu, Liang; Zhang, Zhaoyan (2017) A Computational Study of Vocal Fold Dehydration During Phonation. IEEE Trans Biomed Eng 64:2938-2948
Zhang, Zhaoyan (2017) Effect of vocal fold stiffness on voice production in a three-dimensional body-cover phonation model. J Acoust Soc Am 142:2311
Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc (2016) Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech. J Speech Lang Hear Res 59:994-1001
Signorello, Rosario; Zhang, Zhaoyan; Gerratt, Bruce et al. (2016) Impact of Vocal Tract Resonance on the Perception of Voice Quality Changes Caused by Varying Vocal Fold Stiffness. Acta Acust United Acust 102:209-213
Kreiman, Jody (2016) On Peer Review. J Speech Lang Hear Res 59:480-3
Garellek, Marc; Samlan, Robin; Gerratt, Bruce R et al. (2016) Modeling the voice source in terms of spectral slopes. J Acoust Soc Am 139:1404-10
Titze, Ingo R; Baken, Ronald J; Bozeman, Kenneth W et al. (2015) Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization. J Acoust Soc Am 137:3005-7
Kreiman, Jody; Garellek, Marc; Chen, Gang et al. (2015) Perceptual evaluation of voice source models. J Acoust Soc Am 138:1-10
