In large-vocabulary spontaneous speech, word pronunciations vary much more than in read speech. At the 1996 Summer Workshop on Large Vocabulary Conversational Speech Recognition (WS96), a model of this variability was developed for use in Automatic Speech Recognition (ASR) systems, based on machine-derived descriptions of speech data. This grant continues that work by studying the correlation between pronunciation variation in continuous speech and higher-level information not usually brought to bear in an ASR pronunciation model. One important element of the model is the rate of speech, which has been shown to be a good predictor of word error rate on both read and spontaneous speech corpora. The effects of resyllabification (the movement of syllable boundaries when words are spoken in sequence) and of word frequency on word pronunciations are also investigated. The goal of the project is to make pronunciation variation more predictable for speech recognition models, in particular to reduce recognition error on spontaneous and conversational speech. The techniques will be evaluated on the Switchboard corpus.

Budget Start: 1997-10-01
Budget End: 1998-09-30
Fiscal Year: 1997
Total Cost: $38,133

Name: International Computer Science Institute
City: Berkeley
State: CA
Country: United States
Zip Code: 94704