This project develops incremental language processing techniques that enable spoken dialogue systems to communicate in a way that is more interactive, more efficient, and more human-like. Most previous dialogue systems have employed a strict turn-taking regime in which only one participant speaks at a time, no attempt is made to understand or respond to speech until the speaker has finished, and the overall latency of system responses is high compared with human-human conversation. As a result, these systems cannot produce the rapid and overlapping responses that human interlocutors frequently use to communicate efficiently and successfully, including back-channels, interruptions, collaborative completions, and clarifications. This project is a computational and empirical investigation into how a system's assessment of its own incremental understanding of ongoing user speech can guide its strategic decisions to initiate such responses. The feature representations and response policies that can implement this decision-making are studied in the context of two fast-paced interactive dialogue games. These games are carefully chosen both to support objective evaluation of incremental response strategies and to provide fun gameplay that facilitates large-scale data collection.
The resulting computational models may improve the conversational skills of a range of dialogue systems, including not only game-oriented systems but also practical applications such as intelligent tutoring and training systems, information access systems, and entertainment systems. A second product of this project is an annotated corpus of human-human and human-system dialogue data for use by other researchers. A third product is the incorporation of relevant software into a publicly distributed toolkit for building dialogue systems, in support of further research and education.