Determining the perceptual quality of video transmitted through complex networks and viewed on heterogeneous platforms, from cell phones to Internet-based television, is a key problem for the YouTube generation. It is also central to a variety of vision applications including face detection, face recognition and surveillance. Video is subject to numerous distortions: blur, noise, compression, packet/frame drops, etc. Quality assessment is non-trivial when an undistorted video is not available, and unsolved for multiple distortion types and in distributed, non-stationary viewing environments.

This project designs and creates intelligent video "quality agents" that learn how to determine perceptual video quality in heterogeneous networks, and assesses its impact on decision tasks such as face detection and recognition, all without the benefit of reference videos. It uses statistical properties of natural scenes, perceptual principles, machine learning, and intelligent adaptive agent collectives to handle videos simultaneously impaired by multiple distortion types. A primary application is novel face-salient quality assessment agents and quality-aware face detection algorithms. Multiple, co-operative video and face quality agents are trained using active learning based feedback mechanisms on mobile devices. This project yields adaptive, robust video Quality of Service assessment in real-life networks and provides new insights into human visual quality perception and visual distortion detection. The research team also creates two large, unique video quality databases: (a) A Mobile Video Quality Database of raw and distorted mobile videos and (b) A Distorted Face Database of undistorted and distorted face images, as gold standards for research and development in this area.

Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
University of Texas Austin
United States
Zip Code