Recognizing objects in images is the central problem of computer vision. One approach (``bag of words'') compiles simple statistics on the image brightness patterns and recognizes objects by correlating these statistics, via learning, with the imaged objects. This approach cannot exploit important information about the image's spatial layout. Another approach, perceptual organization (PO), computes a distinctive, structural description of the image contents and recognizes objects based on that description. Researchers agree that PO is a crucial early stage of recognition, and also that its results are unreliable. (Images compress the 3D world and are ambiguous; PO cannot eliminate the ambiguity since it has no high-level knowledge of what the image is ``about.'') This leads to a fundamental dilemma: How can a recognition system use the result of PO if it cannot be trusted?

To exploit perceptual organizations without succumbing to their unreliability, this project uses a strategy that averages over all possible organizations weighted by their probability, instead of computing a single, ``most likely'' image description. This strategy is applied to diverse tasks such as matching images by the shapes of the objects within; recognizing articulated objects such as people and animals; tracking objects through video; and computing stable perceptual organizations. The project also studies the integration of this approach with older ones into a flexible and capable recognition system. The result will be new techniques for the analysis, manipulation, and search of images. The methods developed will be integrated in the curriculum and disseminated to researchers, and the software will be made publicly available.
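
The contrast between averaging over organizations and committing to a single most likely one can be illustrated with a small sketch. It is not the project's implementation: the `Organization` class, the weights, and the match scores are hypothetical, and the sketch assumes a short, explicitly enumerated list of candidate organizations, whereas the space of all possible organizations of a real image would be far too large to list this way.

```python
from dataclasses import dataclass


@dataclass
class Organization:
    """One hypothetical perceptual organization of an image (illustrative only)."""
    label: str          # made-up name for the candidate grouping
    weight: float       # unnormalized probability of this organization
    match_score: float  # how well an object model matches under this organization


def normalize(orgs):
    """Turn the unnormalized weights into probabilities that sum to 1."""
    total = sum(o.weight for o in orgs)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [o.weight / total for o in orgs]


def map_score(orgs):
    """Score obtained by trusting only the single most likely organization."""
    best = max(orgs, key=lambda o: o.weight)
    return best.match_score


def averaged_score(orgs):
    """Expected score: every organization contributes in proportion to its probability."""
    probs = normalize(orgs)
    return sum(p * o.match_score for p, o in zip(probs, orgs))


# Hypothetical numbers: the most likely grouping is only slightly more probable
# than the alternatives, yet it happens to match the object model poorly.
orgs = [
    Organization("grouping A", 0.40, 0.10),
    Organization("grouping B", 0.35, 0.90),
    Organization("grouping C", 0.25, 0.80),
]

print("single most likely organization:", map_score(orgs))
print("average over organizations:     ", round(averaged_score(orgs), 3))
```

With these made-up numbers, the single most likely organization alone would suggest a poor match, while the probability-weighted average reflects the fact that most of the probability mass lies on organizations that match well.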

Budget Start: 2009-09-15
Budget End: 2014-08-31
Fiscal Year: 2009
Total Cost: $376,685
Name: Stevens Institute of Technology
City: Hoboken
State: NJ
Country: United States
Zip Code: 07030