Annotated data sets are a necessity for data-driven speech and language processing approaches. Many speech and natural language processing tasks, such as automatic speech recognition, question answering, machine translation, part-of-speech tagging, parsing, named entity extraction, and semantic role labeling, have benefited significantly from shared tasks that benchmark algorithms and compare results on shared data sets. The goal of this project is to create a goal-oriented, mixed-initiative human-machine dialog system that handles naturally spoken input for conference services, and to make the spoken dialogs collected from this system publicly available for research purposes. Users can call a phone number and learn about conference paper submission, the program, the venue, visa requirements, accommodation options and costs, etc.

We take an iterative approach: the spoken dialog system (SDS) is first deployed for the IEEE SLT workshop, to be held in December 2006, and all of its components can then be improved using the data collected from this deployment. The improved system can in turn be used to collect further data at other conferences and workshops.

Given that data-driven approaches are becoming more popular for many speech and language processing applications, we believe that such a corpus, annotated with system prompts, user utterance transcriptions, user intentions, overall task success, etc., would be a useful resource for dialog management, spoken language understanding, automatic speech recognition, and other related tasks. These annotations can also be extended in the future with user emotion tags, disfluencies, syntactic and semantic parses, etc.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0624389
Program Officer: Tatiana D. Korelsky
Project Start:
Project End:
Budget Start: 2006-04-01
Budget End: 2007-09-30
Support Year:
Fiscal Year: 2006
Total Cost: $75,088
Indirect Cost:
Name: International Computer Science Institute
Department:
Type:
DUNS #:
City: Berkeley
State: CA
Country: United States
Zip Code: 94704