A Comprehensive Psychoacoustic Approach to Voice Quality Perception

Eddins, David; Shrivastav, Rahul

Abstract

Voice disorders often lead to changes in voice quality noticed by patients, clinicians, and conversation partners. Assessment of voice quality is essential for diagnosis, and improvement in voice quality is a critical outcome of treatment. However, our knowledge of how voice quality is perceived is limited, and the availability of robust measures of clinical outcome is even more limited. This has restricted our ability to accurately quantify or describe changes in quality, such as declines due to a disease or improvements resulting from treatment. This continuation project combines concepts and techniques from voice science, speech science, hearing science, and engineering to address these problems. The comprehensive approach of the proposed research simultaneously encompasses the three primary voice quality (VQ) dimensions, embraces their covariance, improves measurement methods, and expands ecological validity through connected speech evaluation. The research proceeds by first obtaining high-precision measures of voice quality perception in the laboratory. These data are then used to develop mathematical models of voice quality perception that accurately reflect listeners' data. To obtain a close match between human judgments of voice quality and model output, models of auditory processing are used to obtain an internal representation of the voice acoustic signal. Specific measures are then captured from this internal auditory representation and used to model the perception of voice quality. In the proposed work, these methods will be used to establish a comprehensive framework for understanding and measuring voice quality perception and to enable translation to clinical practice.
The specific goals of this project are to: (1) develop a more complete understanding of the nature of VQ covariance (among breathiness, roughness, and strain) using natural dysphonic voices; (2) assess VQ in connected speech at micro (segmental) and macro (whole utterance) levels to better capture the impacts of rapid and complex transitions, co-articulatory and prosodic variations, and unique VQ signatures related to specific pathologies; (3) incorporate all of these components into a novel, three-dimensional VQ scaling procedure that is simple, intuitive, efficient enough to be used clinically, and has ratio-level measurement properties; (4) use highly predictive computational models to overcome many of the limitations of acoustic analyses and perceptual evaluation; and (5) evaluate the full range of psychometric properties considered in test design and evaluation, including reliability, validity, sensitivity, and specificity. Perhaps most importantly, the goal is to develop an assessment that accurately indicates responsiveness to change (i.e., disorder progression, treatment). These parameters are considered essential for longitudinal assessment and outcome measurement. The feasibility of using these models and metrics in regular clinical assessment will be evaluated in multiple voice pathologies through a clinical field trial and coordinated offline assessments by clinicians and laboratory subjects.

Public Health Relevance

The goal of this work is to transform dysphonic voice quality assessment in clinical practice and in research by establishing new psychometric scales that accommodate sustained phonation and connected speech. The multidimensional scales will be designed and evaluated with the goals of high validity, reliability, sensitivity, specificity, and, most importantly, responsiveness to change. These methods will also support development of precise, reliable, and automated computational models of voice quality perception.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 2R01DC009029-13A1
Application #: 10211924
Study Section: Motor Function, Speech and Rehabilitation Study Section (MFSR)
Program Officer: Shekim, Lana O

Project Start: 2007-07-01
Project End: 2026-02-28
Budget Start: 2021-03-01
Budget End: 2022-02-28
Support Year: 13
Fiscal Year: 2021

Institution

Name: University of South Florida
DUNS #: 069687242

City: Tampa
State: FL
Country: United States
Zip Code: 33617

Publications

Kopf, Lisa M; Skowronski, Mark D; Anand, Supraja et al. (2017) The Perception of Breathiness in the Voices of Pediatric Speakers. J Voice :

Kopf, Lisa M; Jackson-Menaldi, Cristina; Rubin, Adam D et al. (2017) Pitch Strength as an Outcome Measure for Treatment of Dysphonia. J Voice 31:691-696

Eddins, David A; Anand, Supraja; Camacho, Arturo et al. (2016) Modeling of Breathy Voice Quality Using Pitch-strength Estimates. J Voice 30:774.e1-774.e7

Skowronski, Mark D; Shrivastav, Rahul; Hunter, Eric J (2015) Cepstral Peak Sensitivity: A Theoretic Analysis and Comparison of Several Implementations. J Voice 29:670-81

Eddins, David A; Kopf, Lisa M; Shrivastav, Rahul (2015) The psychophysics of roughness applied to dysphonic voice. J Acoust Soc Am 138:3820-5

Eddins, David A; Shrivastav, Rahul (2013) Psychometric properties associated with perceived vocal roughness using a matching task. J Acoust Soc Am 134:EL294-300

Patel, Sona; Shrivastav, Rahul; Eddins, David A (2012) Developing a single comparison stimulus for matching breathy voice quality. J Speech Lang Hear Res 55:639-47

Shrivastav, Rahul; Eddins, David A; Anand, Supraja (2012) Pitch strength of normal and dysphonic voices. J Acoust Soc Am 131:2261-9

Patel, Sona; Shrivastav, Rahul; Eddins, David A (2012) Identifying a comparison for matching rough voice quality. J Speech Lang Hear Res 55:1407-22

Shrivastav, Rahul; Camacho, Arturo; Patel, Sona et al. (2011) A model for the prediction of breathiness in vowels. J Acoust Soc Am 129:1605-15

Showing the most recent 10 out of 12 publications
