Automated telephone dialog systems rely heavily on accurate transcription of the speech signal into readable text. When the system has low confidence in the automatic speech recognition (ASR) result for a caller's utterance, a typical dialog strategy has the system repeat its best guess and ask for confirmation. This leads to unnatural interactions and dissatisfied callers. The current project focuses on developing better dialog strategies, given current ASR capabilities, by learning strategies automatically from contrasting corpora and comparing the results. Using a novel methodology, wizard ablation, the project collects simulated human-system dialogs that vary in controlled ways. The testbed application, an Automated Readers Advisor for New York City's Andrew Heiskell Talking Book and Braille Library, has appropriately limited complexity and potentially broad social benefit.
The motivation for wizard ablation is that research is needed into the problem-solving strategies humans would use if their communication channel were restricted to be more like a machine's. In conventional Wizard-of-Oz studies, unsuspecting users interact with human wizards "behind the screen", thus providing data on the way humans interact with (what they believe to be) machines. Unlike a conventional wizard, an ablated wizard sees only the ASR output that serves as input to the system's dialog manager. Under a further ablation condition, the wizard must choose actions from the same repertoire that the system uses, but can combine them freely. The book-borrowing scenarios for the wizard interactions have been designed to be realistic, and Heiskell Library patrons participate in the studies. The collected dialogs will be made available to the research community.