Acoustic Correlates of Phonetic Perception

Hillenbrand, James

Abstract

Numerous practical and theoretical problems could be addressed if we had a better understanding of the auditory mechanisms underlying phonetic recognition. Among the practical applications of this knowledge are: (1) the improvement of speech synthesis devices, (2) the development of robust speech recognition devices, (3) the development of acoustically based training devices for hearing-impaired speakers, and (4) the improvement of cochlear-implant signal processors. The proposed experiments fall into three major categories. One set of experiments follows directly from vowel perception studies conducted during the previous grant period. These experiments address issues such as the role of dynamic spectral cues and voice fundamental frequency in vowel perception. A second series of experiments addresses more fundamental issues regarding the spectral representations that control phonetic quality. A major goal of these experiments is to test the validity of a method of representing speech that was developed during the previous grant period. The "Masked Peak Representation" (MPR) was developed as an alternative to both formant representations and whole-spectrum models. The MPR involves a series of spectral manipulations designed to remove aspects of the spectrum that do not appear to have a strong influence on phonetic quality, while retaining those features that are most relevant to phonetic quality judgments. The MPR will be evaluated with: (1) an experiment comparing MPR-based predictions of perceived phonetic distance with those of a more traditional auditory model, (2) speech recognition tests that use a Hidden Markov Model to map sequences of MPR spectra onto words or phonetic segments, and (3) listening tests with speech resynthesized from MPR spectra.
A third set of studies is aimed at modeling the low-level auditory mechanisms that are responsible for spectrum analysis. The goal of this work is to evaluate a model in which spectrum analysis is carried out by the central auditory system rather than the auditory periphery. A software simulation of the model will be developed in an effort to determine the extent to which the central-spectrum model can account for a broad range of findings from the auditory psychophysics literature. Experiments are also proposed that address the implications of this model for vowel perception and for the representation of pitch and periodicity.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC001661-07
Application #: 2837963
Study Section: Special Emphasis Panel (ZRG1-HAR (01))

Project Start: 1992-07-01
Project End: 2001-11-30
Budget Start: 1998-12-01
Budget End: 1999-11-30
Support Year: 7
Fiscal Year: 1999

Institution

Name: Western Michigan University
Department: Other Health Professions
Type: Schools of Allied Health Profes

City: Kalamazoo
State: MI
Country: United States
Zip Code: 49008

Publications

Hillenbrand, James M; Gayvert, Robert T; Clark, Michael J (2015) Phonetics exercises using the Alvin experiment-control software. J Speech Lang Hear Res 58:171-84

Hillenbrand, James M; Clark, Michael J; Baer, Carter A (2011) Perception of sinewave vowels. J Acoust Soc Am 129:3991-4000

Hillenbrand, James M; Clark, Michael J (2009) The role of f0 and formant frequencies in distinguishing the voices of men and women. Atten Percept Psychophys 71:1150-66

Hillenbrand, James M; Gayvert, Robert T (2005) Open source software for experiment design and control. J Speech Lang Hear Res 48:45-60

de Wet, Febe; Weber, Katrin; Boves, Louis et al. (2004) Evaluation of formant-like features on an automatic vowel classification task. J Acoust Soc Am 116:1781-92

Hillenbrand, James M; Houde, Robert A (2003) A narrow band pattern-matching model of vowel perception. J Acoust Soc Am 113:1044-55

Kardach, Jill; Wincowski, Robert; Metz, Dale Evan et al. (2002) Preservation of place and manner cues during simultaneous communication: a spectral moments perspective. J Commun Disord 35:533-42

Hillenbrand, James M; Houde, Robert A (2002) Speech synthesis using damped sinusoids. J Speech Lang Hear Res 45:639-50

Hillenbrand, J M; Clark, M J; Nearey, T M (2001) Effects of consonant environment on vowel formant patterns. J Acoust Soc Am 109:748-63

Hillenbrand, J M; Clark, M J; Houde, R A (2000) Some effects of duration on vowel recognition. J Acoust Soc Am 108:3013-22

Showing the most recent 10 out of 18 publications
