The problem of visual recognition is fundamental towards the goal of automatic image understanding. While a large number of efforts have been made in the computer vision community, machine performance at these tasks remains significantly inferior to human ability. The overarching goal of this project is to leverage the best known visual recognition system - the human visual recognition system. This project employs a "Human Debugging" paradigm to replace various components of a machine vision pipeline with human subjects, and examines the resultant effect on recognition performance. Meaningful comparisons provide valuable insights and pinpoint aspects of the machine vision pipeline that are performance bottlenecks and require future research efforts. Specifically, the project considers the problems of image classification and object detection, and explores the roles of local and global information, as well part-detection, spatial modeling and contextual reasoning (including non-maximal suppression) for these problems respectively. This project touches on a wide range of problems in visual recognition including object recognition, scene recognition and object detection. This novel paradigm of identifying weak links in computational models via humans in the loop is also applicable to other vision problems, as well as other sub-fields in AI. By sharing all collected data and results, and through organized conferences and workshops, this project will initiate and fuel a dialogue with the research community about leveraging humans to advance computer vision. More broadly, this work encourages the involvement of young women and undergraduate students in computer science research.

Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
Toyota Technological Institute at Chicago
United States
Zip Code