The proposed work creates human language resources to stimulate research in three important areas: (a) automatic language identification, the problem of identifying the language being spoken; (b) robust speech recognition, the problem of recognizing speech from different microphones, communication channels, and environmental conditions; and (c) speaker recognition, the problem of identifying a speaker based on his or her voice characteristics. To support research in automatic language identification, over two minutes of speech from 300 speakers in each of twenty-two languages is being collected. To support research in robust recognition, about two minutes of cellular speech from several thousand speakers is being collected. To support research in speaker recognition, twelve hundred speakers are providing speech samples during ten telephone calls over a two-year period. For each corpus, the speech is annotated orthographically, and a portion of each corpus is transcribed phonetically. This work meets a critical need for annotated speech corpora: it stimulates research in three important areas of human language technology, leading to secure access to the information highway for any person who speaks a language.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9529006
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 1996-01-01
Budget End: 1999-12-31
Support Year:
Fiscal Year: 1995
Total Cost: $746,890
Indirect Cost:
Name: Oregon Graduate Institute of Science & Technology
Department:
Type:
DUNS #:
City: Beaverton
State: OR
Country: United States
Zip Code: 97006