Recent advances in text summarization and speech recognition have not been matched by comparable advances in speech summarization. This project on meeting summarization has three focuses. First, it investigates two summarization task definitions: generic extractive summarization and query-based summarization. Second, it addresses the core challenges that arise when text summarization techniques are applied directly to speech recognition output. It evaluates the impact of low-level structural information (such as sentence boundaries and disfluencies), uses high-level meeting information (such as topics, meeting structure, and speaker interaction), and uses rich recognition output (confidence measures for the recognition hypotheses, n-best lists, and lattices) for summarization. Third, it uses several measures to evaluate the effectiveness of summarization approaches, including comparison against human reference summaries, extrinsic metrics (e.g., based on a question-answering task), and human evaluation of the usefulness of the query-based summaries. This project employs advanced algorithms to combine well-motivated rich information from both speech and text for meeting summarization. An important outcome will be findings about the usefulness of the summarization task for the meeting domain and the development of new approaches to measuring success on this task. This work will advance the understanding of human interactions and improve the ability to process human speech automatically. The annotated data and evaluation tools developed in this project will be shared with the community. This project is multidisciplinary, involving speech processing, natural language processing, and conversation analysis. The tight integration of research and education will significantly enhance the training of next-generation researchers.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
0845484
Program Officer
Tatiana D. Korelsky
Project Start
Project End
Budget Start
2009-07-01
Budget End
2014-06-30
Support Year
Fiscal Year
2008
Total Cost
$408,077
Indirect Cost
Name
University of Texas at Dallas
Department
Type
DUNS #
City
Richardson
State
TX
Country
United States
Zip Code
75080