This research addresses the design of real-time two-party and multi-party voice-over-IP (VoIP) systems that achieve high quality as perceived by human listeners in interactive conversation. It focuses on a fundamental understanding of conversational quality and its trade-offs in strategies for scheduling the playback of received speech frames, concealing frames lost in the network, routing packets that carry speech frames over an overlay network, and admitting new connections in multi-party conversations. The perceptual quality of a conversation over a network connection depends on the one-way listening-only speech quality and on the mouth-to-ear delay, that is, the delay from the mouth of a speaker to the ear of a listener. When there are network delays, a conversation perceived by a listener consists of speech segments separated by alternating short and long silence periods. This asymmetry leads to low perceptual quality in multi-party VoIP because some speakers appear to be more distant than others and some respond more slowly than others. In this research, we develop a statistical method to collect subjective test results and a classification method to automatically learn the results and generalize them to unseen network and conversational conditions. We study network-control and scheduling algorithms for reducing the asymmetry in silence periods and improving the quality of the speech segments received.

Project Start:
Project End:
Budget Start: 2008-08-15
Budget End: 2009-07-31
Support Year:
Fiscal Year: 2008
Total Cost: $98,405
Indirect Cost:
Name: University of Illinois Urbana-Champaign
Department:
Type:
DUNS #:
City: Champaign
State: IL
Country: United States
Zip Code: 61820