Recent successes in computer vision, machine learning, and combinatorial optimization are leveraged to tackle once more the image interpretation problem as defined in the early days of computer vision. Interpretation is cast as a problem of simultaneous image segmentation and region classification: Given an image and a list of class labels, the goal is to compute the most probable image segmentation and labeling, one label per segment. This is learning in context in that learning techniques are used to recognize several objects in the context of complex, cluttered images.
A manual image labeling method is proposed that enlists the help of both web surfers and students in a junior high school and a high school to tackle this labor-intensive task, while at the same time exposing young pupils to computer vision research.
The proposed work has intellectual merit relevant to the fields of computer vision, artificial intelligence, and cognition in general. In particular, the notion of defining ``words for pictures'' that this proposal offers may establish a new, fruitful bridge to text retrieval research and widen the discourse between computer vision and other areas of science.
The understanding of visual perception in its more semantic sense of ``image interpretation'' will undeniably have a broader impact on society. From a practical point of view, image understanding systems are useful for information retrieval, surveillance, medical imaging, and in many other endeavors. In addition, the proposed activities include collaboration with industry and government agencies and involve postdocs, graduate and undergraduate students. These activities also explicitly involve younger pupils in grades 6 through 12, and will hopefully help attract them to computer vision.
0535152/0535166
This project addresses the problem of category-level object recognition in images: Its aim is to develop effective methodologies for representing object classes; learning the corresponding object models from cluttered sample images in a semi-supervised manner; and efficiently and robustly recognizing instances of these models in novel images despite clutter, occlusion, viewpoint and illumination changes, and individual variations within each class.
Intellectual Merit. The scientific objective of this project is to develop a representation of the salient parts of an object and their relationships that can effectively be learned from heavily cluttered data in a weakly supervised way, correctly captures within-class variability and appearance changes due to variations in viewpoint and illumination, and effectively supports inference over object models and the automated construction of efficient classification machines.
Broader Impacts. This project will investigate applications of category-level object recognition to image retrieval, video annotation, human-computer interaction, surveillance and security, and robotics via international academic and industrial collaborations. Contributions to education and outreach will include training PhD students and post-doctoral researchers, and involving underrepresented groups in graduate research and in undergraduate data collection and empirical evaluation projects.