Towards Standardizing Perceptual Voice Quality Measures

Kreiman, Jody

Abstract

Judgments of voice quality contribute greatly to patients' opinions of their voice, and to a clinician's decision to initiate and continue treatment. Quality judgments are also used as a standard against which instrumental measures of voice are validated. Nevertheless, these "subjective" measures of voice quality are not highly regarded as either clinical or research tools, because of problems with reliability and untested validity. We hypothesize that problems in voice quality measurement do not originate within the listener, but rather derive from the methods used to measure what listeners hear. The proposed research departs from traditional rating scale methods by applying a speech synthesizer for pathological voice quality to study basic issues concerning reliability and validity of voice quality measures, and to determine the perceptual importance of the acoustic elements that underlie voice quality perception. In this approach, listeners adjust synthesizer parameters to achieve a perceptual match to the original voice. This method-of-adjustment technique explicitly links the acoustic signal to voice quality. We hypothesize that this linkage across the speech chain increases the validity, reliability, and utility of the resulting perceptual responses, by providing listeners with an objective tool (a synthesizer) for quantifying what they hear. The proposed research focuses on four specific aims. First, we will expand, refine, and automate our synthesizer, to increase power, flexibility, and ease of use. Second, we will apply the synthesizer to examine the perceptual importance of several key aspects of pathologic voices, including the shape of the source spectrum, period doubling, and the perceptual interactions of harmonic and inharmonic (noise) energy in the voice source. In each case, we will determine which acoustic features are perceptually important and whether these features interact with others, and we will estimate just-noticeable differences for the parameter as an index of the amount of acoustic change that is clinically meaningful on that dimension. Third, we will evaluate clinicians' ability to apply the synthesizer as a tool for measuring voice quality in the clinic. We will also revisit the use of synthetic anchors in rating protocols as an alternative to the method-of-adjustment task. Finally, we will evaluate the external validity of the synthesis technique by comparing perceptual measures derived from sustained vowels to those derived from continuous speech. By increasing the power and ease of the synthesizer's use, examining listener sensitivity to important acoustic variables, and studying the generalizability of this approach, the proposed research will bring our goal of a standardized voice quality evaluation protocol within reach.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC001797-18
Application #: 7742617
Study Section: Motor Function, Speech and Rehabilitation Study Section (MFSR)
Program Officer: Shekim, Lana O

Project Start: 1992-12-01
Project End: 2011-09-20
Budget Start: 2009-12-01
Budget End: 2011-09-20
Support Year: 18
Fiscal Year: 2010
Total Cost: $520,076
Indirect Cost

Institution

Name: University of California Los Angeles
Department: Surgery
Type: Schools of Medicine
DUNS #: 092530369

City: Los Angeles
State: CA
Country: United States
Zip Code: 90095

Publications

Park, Soo Jin; Yeung, Gary; Vesselinova, Neda et al. (2018) Towards understanding speaker discrimination abilities in humans and machines for text-independent short utterances of different speech styles. J Acoust Soc Am 144:375

Zhang, Zhaoyan (2018) Vocal instabilities in a three-dimensional body-cover phonation model. J Acoust Soc Am 144:1216

Wu, Liang; Zhang, Zhaoyan (2017) A Computational Study of Vocal Fold Dehydration During Phonation. IEEE Trans Biomed Eng 64:2938-2948

Zhang, Zhaoyan (2017) Effect of vocal fold stiffness on voice production in a three-dimensional body-cover phonation model. J Acoust Soc Am 142:2311

Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc (2016) Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech. J Speech Lang Hear Res 59:994-1001

Signorello, Rosario; Zhang, Zhaoyan; Gerratt, Bruce et al. (2016) Impact of Vocal Tract Resonance on the Perception of Voice Quality Changes Caused by Varying Vocal Fold Stiffness. Acta Acust United Acust 102:209-213

Kreiman, Jody (2016) On Peer Review. J Speech Lang Hear Res 59:480-3

Garellek, Marc; Samlan, Robin; Gerratt, Bruce R et al. (2016) Modeling the voice source in terms of spectral slopes. J Acoust Soc Am 139:1404-10

Bagha, Ashok K; Modak, S V (2015) Structural sensing of interior sound for active control of noise in structural-acoustic cavities. J Acoust Soc Am 138:11-21

Titze, Ingo R; Baken, Ronald J; Bozeman, Kenneth W et al. (2015) Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization. J Acoust Soc Am 137:3005-7

Showing the most recent 10 out of 35 publications
