This action funds Darren Seibert of the Massachusetts Institute of Technology to conduct a research project in Biological Sciences during the summer of 2013 at RIKEN in Wako, Japan. The project title is "Encoding Human Brain Activity Using a Multi-Staged Model of Visual Object Perception." The host scientist is Justin Gardner.
It is easy to take for granted our ability to visually perceive the world. Making sense of the vast amount of information transmitted nearly continuously from our retinae is not a trivial problem: it requires approximately one third of the human cortex to process. Our ability to effortlessly recognize a vast array of objects within tenths of a second, even though their images fall on the retina under different illuminations, positions, and rotations, reflects the end stage of processing in the so-called ventral pathway. This project contributes to the reverse engineering of object recognition by expanding current prediction and fitting procedures so that they better exploit the rich, spatially distributed response patterns in human functional magnetic resonance imaging (fMRI).
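As an illustration of what a prediction and fitting procedure for spatially distributed fMRI responses can look like, the following is a minimal sketch of a voxelwise encoding model. It is not the project's actual analysis code. It assumes that each image is summarized by a feature vector from one stage of a model, that responses are linear in those features, and that ridge regression is used for fitting. All function names, array shapes, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ridge(features, responses, alpha=1.0):
    """Fit a linear map from model features to voxel responses.

    Assumed shapes: features is (n_images, n_features) and
    responses is (n_images, n_voxels). No intercept is fitted,
    so inputs are assumed to be roughly zero-mean.
    Returns weights of shape (n_features, n_voxels).
    """
    n_features = features.shape[1]
    # Closed-form ridge solution: (X^T X + alpha * I)^-1 X^T Y
    gram = features.T @ features + alpha * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ responses)


def voxelwise_correlation(predicted, observed):
    """Pearson correlation between prediction and data, one value per voxel."""
    p = predicted - predicted.mean(axis=0)
    o = observed - observed.mean(axis=0)
    denom = np.linalg.norm(p, axis=0) * np.linalg.norm(o, axis=0)
    return (p * o).sum(axis=0) / denom


def simulate_and_score(n_train=200, n_test=50, n_features=20,
                       n_voxels=30, noise=0.5, alpha=1.0, seed=0):
    """Fit on synthetic training images and score on held-out images.

    The data are simulated (hypothetical), standing in for model-stage
    features and measured fMRI responses.
    """
    rng = np.random.default_rng(seed)
    true_weights = rng.normal(size=(n_features, n_voxels))
    x_train = rng.normal(size=(n_train, n_features))
    x_test = rng.normal(size=(n_test, n_features))
    y_train = x_train @ true_weights + noise * rng.normal(size=(n_train, n_voxels))
    y_test = x_test @ true_weights + noise * rng.normal(size=(n_test, n_voxels))

    weights = fit_ridge(x_train, y_train, alpha=alpha)
    return voxelwise_correlation(x_test @ weights, y_test)


if __name__ == "__main__":
    scores = simulate_and_score()
    print("mean held-out correlation across voxels:", scores.mean())
```

Scoring on held-out images, rather than on the images used for fitting, keeps the per-voxel correlation from being inflated by overfitting. In practice the regularization strength `alpha` would typically be chosen by cross-validation.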
Broader impacts of an EAPSI fellowship include providing the Fellow a first-hand research experience outside the U.S.; an introduction to the science, science policy, and scientific infrastructure of the respective location; and an orientation to the society, culture, and language. These activities meet the NSF goal to educate for international collaborations early in the career of its scientists, engineers, and educators, thus ensuring a globally aware U.S. scientific workforce. Furthermore, the Fellow plans to make his analysis source code publicly available to aid other researchers and to facilitate learning.
Visual object recognition is supported by a network of neural circuitry located in the ventral stream. Substantial evidence shows that the ventral stream is a series of processing stages in which representations in successive areas encode semantic content (e.g., category, identity) increasingly explicitly. Prior work has demonstrated that response properties in early visual areas are largely driven by edge and curvature selectivity. However, no comparably parsimonious description has been found for the complex response properties of more anterior visual regions. Instead of taking the traditional "bottom-up" approach of searching for the visual features responsible for those response properties, we took a "top-down" approach: we tested the extent to which a convolutional neural network optimized for object recognition performance can predict population responses of the ventral stream. We used functional imaging to capture human neural responses to image stimuli that have proved useful in exposing key object recognition challenges. We found that the model layers show remarkably high similarity to responses across ventral retinotopic and object-selective regions thought to be involved in core object recognition, and that this similarity increases as the model becomes better optimized for recognizing objects. These findings demonstrate that optimizing for recognition performance is sufficient to drive a large convolutional neural network closer to human visual representations, and they suggest that the ventral stream may have been shaped by performance constraints.
The broader impacts of this fellowship included first-hand experience in a research environment outside of the U.S. Specifically, this fellowship provided an introduction to Japanese scientific infrastructure, culture, society, and language. This fellowship met the goal of creating and encouraging international collaborations.