This project exploits advances in parallel computing hardware and a neuroscience-informed perspective to design next-generation computer vision algorithms that aim to match a human's ability to recognize objects. The human brain has superlative visual object recognition abilities -- humans can effortlessly identify and categorize tens of thousands of objects with high accuracy in a fraction of a second -- and a stronger connection between neuroscience and computer vision has driven new progress on machine algorithms. However, these models have not yet achieved robust, human-level object recognition, in part because the number of possible "bio-inspired" model configurations is enormous. Powerful models hidden in this model class have yet to be systematically characterized, and the correct biological model is not known.
To break through this barrier, this project will leverage newly available computational tools to undertake a systematic exploration of the bio-inspired model class, using a high-throughput approach in which millions of candidate models are generated and screened for desirable object recognition properties (Objective 1). To drive this systematic search, the project will create and employ a suite of benchmark vision tasks and performance "report cards" that operationally define what constitutes a good visual image representation for object recognition (Objective 2). The highest-performing visual representations harvested from these ongoing high-throughput searches will be used for applications in other machine vision domains, to generate new experimental predictions, and to determine the underlying computing motifs that enable this high performance (Objective 3). Preliminary results show that this approach already yields algorithms that exceed state-of-the-art performance in object recognition tasks and generalize to other visual tasks.
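As a rough illustration of the generate-and-screen idea described above, the following Python sketch samples random configurations from a small, hypothetical parameter space and keeps the best-scoring ones. The parameter names, value ranges, and the toy scoring function are assumptions made purely for illustration; they are not the project's actual model class or benchmark suite.

```python
import random

# Hypothetical search space: parameter names and value ranges are
# illustrative assumptions, not the project's actual model family.
SEARCH_SPACE = {
    "n_layers": [1, 2, 3],
    "filter_size": [3, 5, 7, 9],
    "n_filters": [16, 32, 64, 128],
    "pool_size": [2, 3, 5],
    "norm_exponent": [1.0, 2.0, 10.0],
}


def sample_config(rng):
    """Draw one candidate model configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def screening_score(config):
    """Placeholder for a benchmark evaluation (an assumption).

    A real screen would build the model and measure its accuracy on a
    benchmark task; this toy formula only exists to make the sketch run.
    """
    return (
        0.1 * config["n_layers"]
        + 0.001 * config["n_filters"]
        - 0.01 * abs(config["filter_size"] - 5)
    )


def high_throughput_search(n_candidates, n_keep, seed=0):
    """Generate n_candidates random configs and keep the n_keep best."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_candidates):
        config = sample_config(rng)
        scored.append((screening_score(config), config))
    # Sort by score only, highest first; configs themselves are not compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n_keep]


top = high_throughput_search(n_candidates=1000, n_keep=5)
for score, config in top:
    print(f"{score:.3f}", config)
```

In a real search, each candidate evaluation is independent of the others, which is what would make this kind of screen well suited to parallel hardware.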
As the scale of available computational power continues to expand, this approach holds great potential to rapidly accelerate progress in computer vision, neuroscience, and cognitive science: it will create a large-scale "laboratory" for testing neuroscience ideas within the domain of computer vision; it will generate new, testable computational hypotheses to guide neuroscience experiments; it will produce a new kind of multidimensional image challenge suite that will be a rallying point for computer models, neuronal population studies, and behavioral investigations; and it could unleash a host of new applications.
While humans are able to interpret complex visual scenes with great accuracy and speed, we do not yet know how to replicate these abilities in machines. Biologically-inspired vision algorithms seek to emulate what is known about the computational architecture of the brain in order to build better computer vision systems that can perform a variety of visual tasks, ranging from recognizing faces and objects to identifying actions being performed in a video. The purpose of this project was to develop new methods for building better biologically-inspired vision algorithms. While we have learned much about the brain and its inner workings in recent years, we still do not know enough to precisely guide how we build brain-inspired algorithms. Thus, in many cases, while researchers have clues about the broad strokes that are likely to be important in these algorithms, the details are unclear. As a result, for any given set of ideas about how such systems should work, there is not just one algorithm, but instead a large family of possible algorithms. In this project, we leveraged recent advances in high-performance computing to build tools for more effectively exploring the space of possible biologically-inspired algorithms.
The outcomes from this project were threefold. First, we developed infrastructure for constructing and evaluating biologically-inspired models using high-performance computing clusters. This work has also resulted in new methods for optimizing software code, which are broadly applicable even beyond computer vision. Second, we developed a greater understanding of which aspects of biologically-inspired vision systems are most important to achieving good visual recognition performance, and of what constitutes a good protocol for evaluating their true promise. In addition to guiding our development of biologically-inspired models towards better and better systems, this work holds the promise to ultimately feed back into neuroscience by providing insights about which properties found in effective artificial systems might also be found in the brain. Third, we used these new tools to build promising new solutions for key practical application areas in computer vision, including face recognition and activity recognition, demonstrating strong commercial potential and broader impact beyond academic research.