Dialogue Prosody in Interactive Voice Response Systems

Hirschberg, Julia; Ward, Gregory

Abstract

The unnaturalness of synthesized speech in current interactive voice response (IVR) systems is due to a lack of natural prosodic variation. State-of-the-art IVR systems are often highly intelligible and may sound natural for short prompts, or when the text to be spoken is close to speech recorded for the system's database. Once they deviate from these narrow bounds, however, results range from "boring and mechanical" to "odd and confusing." To address these deficiencies, the PIs are developing a new method for learning contour assignment for dialogue systems that avoids the sparse data problem without requiring massive new annotation. Working from corpora, they have generated a series of hypotheses about which features of the dialogue context influence human speakers' choice of contour, and they are testing these hypotheses in a series of targeted laboratory experiments. By designing a carefully controlled set of production and perception studies, the PIs will be able to determine which intonational features are most reliably correlated with contour choice and which are perceptually most salient for listeners.

From a practical viewpoint, an IVR system that incorporates the appropriate assignment of full intonational contours will greatly enhance the perceived naturalness of the system. From a scientific viewpoint, such a model will expand our understanding of how speakers use and hearers interpret intonational contour variation. From a social viewpoint, the creation of IVR systems that interact with users naturally will increase the acceptability of such systems, bringing the vision of ubiquitous access to information and services for all closer to reality.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0307905
Program Officer: Tatiana D. Korelsky

Budget Start: 2003-07-01
Budget End: 2007-06-30
Fiscal Year: 2003
Total Cost: $542,303

Columbia University, New York, NY, United States
