Using modern learning techniques, it is now possible to teach computers visual concepts through example-based learning. But this process is time-consuming and arduous. Large data sets must often be collected manually. Machines typically do not take advantage of previously learned knowledge when performing new tasks. And when confronted with a new situation, systems fail catastrophically. The goal of this research is to make it dramatically easier to teach vision systems new skills, and to design machines that can learn tasks faster by leveraging previously learned knowledge. In short, the aim is to develop computer vision systems that are largely self-taught. More specifically, this research will focus on problems such as learning from a small number of examples; using previously learned knowledge to improve performance on novel tasks; learning properties of one object that can be used to make inferences about other objects; acquiring and organizing information autonomously; and leveraging interdisciplinary techniques to help relieve people of the burden of "training" computers.

These capabilities are taken for granted in human beings, but their absence is a serious shortcoming of today's computer systems. A central tenet of this work is that it is impractical to train vision systems one problem at a time, acquiring large training sets and developing training paradigms for each task to be learned. There are many scenarios in which training data are severely limited (for example, only a limited number of photographs of Abraham Lincoln exist). And ideally, computer systems should be adaptive, and should not have to be prepared for each new task, especially when these new tasks are similar to previous ones. Some specific areas of investigation include learning to recognize any particular car or face from a single example, simply by watching other cars or faces as they move about; developing software for robots to continuously explore the visual world and the interactions between vision and the other senses; and learning to recognize typewritten text in a font never seen before, without ANY training examples of that font. The common thread in these efforts is that they relieve the burden on the person teaching the computer. The final goal is to develop computers that can be taught simply and rapidly, and that can explore on their own.

Educational initiatives will be developed in two areas. The first area is minority and low-income outreach, involving a group of students at an urban Massachusetts school. The second area involves curriculum development and curriculum guidance at the college and graduate levels at UMass Amherst.

Project web page: www.cs.umass.edu/~elm/CAREER

Project Report

The research conducted for this grant has been in several major areas, including 1) face recognition and face detection, 2) scene text recognition, and 3) the basic science of image comparison.

Our work in face recognition and face detection has resulted in several contributions. One of our most important contributions was to shift the emphasis of research in the community from recognizing faces in a laboratory environment, where factors such as the lighting of a photograph and the expression on a subject's face are carefully controlled, to a setting in which these variables are not controlled. To promote the idea of analyzing face pictures "in the wild", or in a natural setting, we published a large database of natural photographs of people to be used as a test benchmark in face recognition. This database, called Labeled Faces in the Wild, is available publicly, and has been successful in driving research on the problem of real-world face recognition. We have also contributed several original methods for recognizing faces, for detecting faces, and for jointly processing faces and captions in news photographs. Each of these methods, at the time of publication, represented the state of the art in its sub-problem of face recognition. The fact that two of these three algorithms have since been surpassed by other researchers is partly due to the success of our databases in stimulating research on difficult, real-world face recognition problems.

Another area of research we have worked in involves the recognition, by computer, of letters and words that would be seen on signs in an outdoor setting. We call this problem "scene text recognition". This might involve recognizing a "walk signal" at a crosswalk, the name of a coffee shop, or the sign over a post office. One application for such computer reading of outdoor text would be as an aid to the blind or visually impaired. While this problem is very easy for sighted people to solve, it is quite difficult for computers to solve. Our group has made several advances on this problem, and some of these methods have been adapted by other groups in ongoing efforts to improve computer performance in the recognition of text.

Another recent area of emphasis in our research has been the basic problem of learning how to compare two images for similarity. Consider two images of faces. A computer might decide that if the two images are "similar enough", then they must depict the same person. However, the difficult part of this problem is that there are many ways to judge the similarity of two images. Should they have the same colors, the same edges, the same swatches of texture? We have produced a new way of thinking about the problem of image similarity. We can measure its success by noting that, using this novel notion of image similarity, we are able to build tracking systems that perform better than any previous tracking system.

The CAREER award gave our group the flexibility to work on a variety of problems, all of which used statistical methods to produce better performance on traditional vision problems, from face recognition to text recognition and tracking.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0546666
Program Officer: Jie Yang
Project Start:
Project End:
Budget Start: 2006-03-15
Budget End: 2012-02-29
Support Year:
Fiscal Year: 2005
Total Cost: $513,498
Indirect Cost:

Name: University of Massachusetts Amherst
Department:
Type:
DUNS #:
City: Amherst
State: MA
Country: United States
Zip Code: 01003