This project develops an integrated framework that performs object discovery and detector training simultaneously in an unsupervised setting. It takes advantage of the large number (millions or even billions) of well-organized internet images to automatically learn rich image representations for a wide range of objects. The main activities in this project are the following. (1) The central component is a formulation that turns unsupervised data into weakly supervised "noisy input", from which commonalities are explored to build rich object representations using a new learning method. (2) A large dictionary of mid-level image representations will be learned from a large number of images retrieved by querying internet search engines with thousands of object words. (3) A new flexible object representation is developed to handle articulated and non-rigid objects.
The project advances the fields of computer vision and machine learning by developing an unsupervised paradigm for exploring internet images at large scale. The mid-level and high-level representations learned from images retrieved using thousands of words can significantly enhance the power of object representations and benefit researchers in object recognition. The formulations, algorithms, and methods resulting from this project are also helpful to researchers in other fields, such as medical imaging and data mining. The project dissemination plan includes releasing the source code and the learned mid-level and high-level representations.