How the brain estimates the 3D shape of objects in our surroundings remains one of the most significant challenges in visual neuroscience. The information provided by the retina is fundamentally ambiguous, because many different combinations of 3D shape, illumination and surface reflectance are consistent with any given image. Despite this ambiguity, the visual system is extremely adept at estimating 3D shape across a wide range of viewing conditions, something that no extant machine vision system can do. The long-term goal of the project is to develop a computational model, formulated in neural terms, that explains how 3D shape is estimated in the primate visual system. The model will build upon the responses of cells in early visual cortex (V1) and describe how they can be organized into mid-level configurations that specify 3D shape properties. Importantly, the project will also measure human perception of 3D shape in a series of psychophysical experiments designed to test specific predictions, bringing together the complementary expertise of Roland W. Fleming (Giessen University: human perception, psychophysics) and Steven W. Zucker (Yale University: computational vision, computational neuroscience). The results should provide a deeper understanding of visual circuit properties in the ventral processing stream; they should provide models for 3D computer vision and graphics; and they may pave the way for the development of rehabilitation strategies for patients with visual deficits.

The basic approach starts with populations of neurons tuned to different orientations and seeks to understand how these provide basic information about local shape properties according to the principles of differential geometry. Specifically, when 3D surfaces are projected onto the retina, the distorted gradients of shading and texture lead to highly structured patterns of local image orientation, or orientation fields, which can be inferred via circuits involving long-range horizontal connections. The investigators seek to derive formal models showing how these networks can be organized to infer 3D surface properties. The specific approach involves four stages: (i) modeling how the visual system obtains clean and reliable orientation fields from the outputs of model V1 cells through lateral interactions and feedback; (ii) establishing how local measurements are grouped into specific "mid-level" configurations to support the recovery of 3D shape properties (modeling V2 to V4); (iii) modeling how these low- and mid-level 2D measurements can be mapped into representations of 3D shape properties (V4 to IT); and (iv) modeling how grouping and global constraints can convert these shape estimates into global shape reconstructions (again V4 to IT). Targeted psychophysical experiments will complement all of the modeling and test specific predictions from it. The resulting stimuli will support next-generation neurophysiological experiments. Although the above stages define a working strategy, dependencies among these stages should also provide a model of the feedforward/feedback projections that link different areas of cortex. The ultimate goal is a model that can correctly predict the errors, the successes, and the limits of human shape perception.
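To make the notion of an orientation field concrete, the following is a minimal illustrative sketch (not the investigators' model) of how a dense map of local image orientation can be extracted from an image. It uses the standard structure-tensor method as a stand-in for a population of orientation-tuned, V1-like filters; the function name, parameters, and the `sigma` smoothing scale are assumptions introduced here for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def orientation_field(image, sigma=2.0):
    """Estimate the dominant local orientation at each pixel via the
    structure tensor (a conventional proxy for pooling the responses
    of orientation-tuned filters). Returns an angle map (radians) and
    a coherence map in [0, 1] indicating how well-defined the local
    orientation is. Illustrative sketch, not the project's model."""
    # Image gradients (np.gradient returns derivatives along axis 0, axis 1).
    gy, gx = np.gradient(image.astype(float))
    # Structure-tensor components, pooled over a local Gaussian neighborhood.
    jxx = gaussian_filter(gx * gx, sigma)
    jyy = gaussian_filter(gy * gy, sigma)
    jxy = gaussian_filter(gx * gy, sigma)
    # Angle of the dominant gradient direction; local contours run
    # perpendicular to this.
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    # Coherence = (lambda1 - lambda2) / (lambda1 + lambda2) of the tensor:
    # near 1 for strongly oriented structure, near 0 for isotropic regions.
    coherence = np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2) / (jxx + jyy + 1e-12)
    return theta, coherence
```

For a shaded or textured rendering of a 3D surface, the resulting angle map is exactly the kind of highly structured orientation field described above: its flow lines bend systematically with the underlying surface geometry, which is what makes such fields informative about 3D shape.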

This project is jointly funded by the Collaborative Research in Computational Neuroscience (CRCNS) program and the Office of International Science and Engineering. A companion project is being funded by the German Ministry of Education and Research (BMBF).

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Program Officer: Kenneth C. Whang
Yale University, New Haven, United States