This project investigates significant scientific questions concerning the statistical distributions of range, disparity, chrominance, and luminance in natural 3D images of the world through two deeply intertwined efforts: (1) developing a comprehensive database of co-registered luminance, chrominance, range, and disparity images of natural scenes; and (2) conducting eye movement studies on stereoscopic images. Using the acquired database, the research team studies and models the bivariate statistics of luminance, chrominance, range, and disparity. In the eye movement studies, the locations of visual fixations are measured as they land in range space against where they land in luminance, chrominance, and disparity space, making it possible to develop gaze-contingent models of the statistics of luminance, chrominance, range, and disparity. The results of these studies have broad significance in vision science and image processing. To exemplify this, new approaches to computational stereo and to stereo image quality assessment are developed: new computational stereo algorithms using appropriate prior and posterior distribution models on disparity, and new algorithms for stereopair image quality assessment built on the statistical models developed in the project. These new algorithms stand to significantly impact the emerging 3D digital cinema, gaming, and television industries by allowing automatic assessment of 3D presentations to human viewers. The developed 3D range-luminance databases are made available via public web portals, and the results of the work are published in the highest-profile vision science and image science journals.

Project Report

(1) We discovered previously unknown properties of the statistics of natural 3D images. In particular, we created first-of-a-kind models of the conditional probability distributions of bandpass luminance images given bandpass range or disparity images, using a co-registered database of luminance and range images. We found that the magnitudes of luminance and range/disparity coefficients show a clear positive correlation: at a location with larger luminance variation, there is a higher probability of a larger range/disparity variation. (A minimal numerical sketch of this kind of measurement follows item (4) below.) As an example of the usefulness of luminance statistics conditioned on range/disparity statistics, we modified a well-known Bayesian stereo ranging algorithm using our natural scene statistics models, which significantly improved its performance. (See Fig. 1.) These results are significant for 3D modeling, 3D reconstruction, 3D recognition, 3D perception, 3D image quality prediction, 3D cinema and visual comfort, and many other fields.

(2) We conducted 3D eye tracking experiments on naturalistic stereo images and found the surprising result that disparity contrast and disparity gradient at fixated locations are generally lower than at randomly selected locations. (See Fig. 2, where red denotes the luminance gradient, i.e., the rate of change of luminance, and blue denotes the disparity gradient, i.e., the rate of change of depth.) This result has significant implications for understanding 3D saliency, and for 3D computer vision algorithms, 3D reconstruction, 3D image and video quality, and 3D cinema.

(3) We created a very high quality dataset of co-registered color and range values collected specifically for these 3D natural scene studies, and we evaluated the statistics of perceptually relevant chromatic information in addition to luminance, range, and binocular disparity information. (See Fig. 3.) The most fundamental finding is that the probabilities of finding range changes depend in a systematic way on color, extending our prior finding in (1). Our chromatic statistical distribution models further improved the performance of the Bayesian stereo algorithm considered in (1), resulting in even fewer matching errors under the Middlebury stereo database and criteria. These are broad theoretical results of very wide applicability to 3D modeling, reconstruction, recognition, perception, image quality prediction, and cinema.

(4) We conducted two human studies aimed at understanding the perception of distorted 3D images by analyzing subjects' performance in locating local distortions in stereoscopically viewed images. (See Fig. 4.) We found that binocular suppression of visual distortion artifacts occurs when viewing blur-, JPEG-, and JP2K-distorted stereo 3D images. This has deep implications for 3D image and video quality assessment, for digital 3D cinema, and for the development of 3D image and video compression protocols.
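To make the flavor of the bandpass statistics in (1) concrete, the following is a minimal sketch of how the positive correlation between luminance and range coefficient magnitudes might be measured on one co-registered image pair. The difference-of-Gaussians bandpass, the .npy file names, and the use of a simple Pearson correlation are illustrative assumptions, not the project's actual decomposition or estimation pipeline.

```python
# Minimal sketch: correlate bandpass luminance and bandpass range
# coefficient magnitudes on a co-registered luminance/range pair.
# The filenames and the difference-of-Gaussians bandpass are
# illustrative assumptions, not the project's actual pipeline.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import pearsonr

def bandpass(img, sigma=2.0):
    """Difference-of-Gaussians bandpass (a stand-in for whatever
    bandpass decomposition is actually used)."""
    return gaussian_filter(img, sigma) - gaussian_filter(img, 2 * sigma)

# Hypothetical co-registered inputs: grayscale luminance and range maps.
luminance = np.load("luminance.npy").astype(np.float64)
range_map = np.load("range.npy").astype(np.float64)

lum_bp = bandpass(luminance)
rng_bp = bandpass(range_map)

# The reported finding concerns coefficient *magnitudes*: large local
# luminance variation co-occurs with large local range variation,
# which shows up here as a positive correlation.
r, p = pearsonr(np.abs(lum_bp).ravel(), np.abs(rng_bp).ravel())
print(f"magnitude correlation r = {r:.3f} (p = {p:.2g})")
```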
(5) Based on the human studies in (4), we developed a Full Reference (FR) model for assessing the quality of stereoscopic images that have been afflicted by possibly asymmetric distortions. The resulting 3D Full Reference Image Quality Assessment (3D FR IQA) algorithm was shown to produce significantly better results than any prior model. The new model will have substantial impact on the 3D image and video capture, communication, display, and entertainment fields, including 3D cinema, 3D television, 3D gaming, and the 3D Internet.

(6) Also based on the human studies in (4), we developed a first-of-a-kind No Reference (NR) 3D image quality assessment model that operates on distorted 3D images containing symmetric or asymmetric distortions, or both. (See Fig. 5.) The algorithm derived from the model significantly outperforms all prior 3D FR IQA models, despite requiring no reference image. (A generic sketch of the natural scene statistics features underlying this style of no-reference model follows at the end of this report.) The new model will have substantial impact on 3D wireless video applications, such as 3D-equipped smartphones and tablets, and on computer vision algorithms that use 3D, such as 3D face recognizers.

(7) We developed a new subjective study methodology called Interactive Continuous Quality Evaluation (ICQE). ICQE enables better interaction in immersive 3D human studies through the use of tablets equipped with tactile and audio cues, allowing the user to remain visually focused on the stimulus and enabling large-scale, many-subject studies to be run in single multi-subject sessions. The new method will benefit vision science, human factors analysis, 3D video quality of experience, and 3D cinema and gaming.

(8) Visual discomfort assessment (VDA) is the prediction of the visual discomfort experienced when viewing stereoscopic images, which is caused mainly by the vergence-accommodation conflict and leads to headaches, fatigue, eye strain, and reduced visual ability. We developed a state-of-the-art, model-based computational method called the Visual Discomfort Predictor (VDP), which automatically predicts perceived visual discomfort. We compared the performance of VDP against other recent leading algorithms and showed that VDP is statistically superior to all prior methods. (See Fig. 6: the three results on the right are from our work, while competing algorithms with lower correlation/accuracy appear on the left.) VDP will benefit 3D cinema, 3D television, 3D gaming, the 3D Internet, and related fields.
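For readers interested in the mechanics behind no-reference models of the kind described in (6), the following is a minimal sketch of the general natural-scene-statistics feature approach: mean-subtracted, contrast-normalized (MSCN) coefficients are computed per view and summarized by a generalized Gaussian shape parameter, which distortion tends to shift away from its natural value. This is a generic illustration under stated assumptions (a moment-matching fit and hypothetical input files); it is not the project's actual 3D NR IQA model.

```python
# Generic sketch of NSS features used by no-reference IQA models:
# MSCN coefficients summarized by a generalized Gaussian (GGD) shape
# parameter. Illustrative only -- not the project's 3D NR IQA model.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.special import gamma

def mscn(img, sigma=7/6):
    """Ruderman-style divisive normalization of local luminance."""
    mu = gaussian_filter(img, sigma)
    var = gaussian_filter(img * img, sigma) - mu * mu
    return (img - mu) / (np.sqrt(np.clip(var, 0, None)) + 1.0)

def ggd_shape(coeffs):
    """Moment-matching estimate of the GGD shape parameter.
    For a GGD, (E|x|)^2 / E[x^2] = Gamma(2/a)^2 / (Gamma(1/a)Gamma(3/a));
    we invert this relation over a grid of candidate shapes."""
    rho = np.mean(np.abs(coeffs)) ** 2 / np.mean(coeffs ** 2)
    shapes = np.arange(0.2, 10.0, 0.001)
    ratios = gamma(2 / shapes) ** 2 / (gamma(1 / shapes) * gamma(3 / shapes))
    return shapes[np.argmin((ratios - rho) ** 2)]

# Hypothetical inputs: the left and right views of a stereopair.
left = np.load("left_view.npy").astype(np.float64)
right = np.load("right_view.npy").astype(np.float64)

# MSCN coefficients of pristine natural images are near-Gaussian
# (shape ~2); distortions push the estimated shape away from that.
features = [ggd_shape(mscn(v)) for v in (left, right)]
print("GGD shape features (left, right):", features)
```

In a full model, features like these (computed over scales, orientations, and binocular channels) would feed a learned regression onto human opinion scores; the sketch above shows only the feature-extraction idea.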

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
0917175
Program Officer
Jie Yang
Budget Start
2009-07-15
Budget End
2013-06-30
Fiscal Year
2009
Total Cost
$496,614
Name
University of Texas Austin
City
Austin
State
TX
Country
United States
Zip Code
78712