How does a vision system recover the 3-dimensional structure of the world -- such as the layout of the environment, surface shape, or object motion -- from the dynamic 2-dimensional images received by the sensors in a camera, or the retinas in our eyes? This problem is fundamental to both computer and biological vision. Computer vision has developed a variety of algorithms for estimating specific aspects of a scene such as the 3-dimensional positions of points whose correspondence over time can be established, but obtaining complete and robust scene representations for complex natural scenes and viewing conditions remains a challenge. Biological vision systems have evolved impressive capabilities that suggest they have detailed and robust representations of the 3-dimensional world, but the neural representations that subserve this are poorly understood and neurophysiological studies thus far have provided little insight into the computational process. This project will pursue an interdisciplinary approach by attempting the understand the universal principles that lie at the heart of 3-dimensional scene analysis.

Specifically, the project will 1) develop a novel class of computational models that recover and represent 3-dimensional scene information, 2) collect high quality video and range data of dynamic natural scenes under a variety of controlled motion conditions, and 3) test the perceptual implications of these models in psychophysical experiments. The computational models will utilize non-linear decomposition - i.e., the ability to explain complex, time-varying images in terms of the non-linear interaction of multiple factors, such as the interaction between observer motion, the 3-dimensional scene layout, and surface patterns. Importantly, the components of these models will be adapted to the statistics of natural motion patterns that arise from observer motion through natural scenes and movement around points of fixation.

The project is a collaboration between three laboratories that have played a leading role in developing theoretical models of natural image statistics, visual neural representations, and perceptual processes. The investigators seek to combine their efforts to develop new models, data sets, and characterizations of 3-dimensional natural scene structure that go beyond previous studies of natural image statistics, and that can be tested in neurophysiological and psychophysical experiments. This project has the potential to bring about fundamental advances in neuroscience, visual perception, and computer vision by developing new classes of models that robustly infer representations of the 3-dimensional natural environment. It will create a set of high quality databases that will be made available to help other investigators study these issues. It will also open up new possibilities for generating realistic stimuli that can guide novel investigations of neural representation and processing.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
1111328
Program Officer
Kenneth C. Whang
Project Start
Project End
Budget Start
2011-09-01
Budget End
2015-08-31
Support Year
Fiscal Year
2011
Total Cost
$302,048
Indirect Cost
Name
University of Texas Austin
Department
Type
DUNS #
City
Austin
State
TX
Country
United States
Zip Code
78712