Experience is thought to play a critical role in shaping the cortical representations that support object recognition by creating neural responses that are selective for some dimensions of change and invariant to others. Although many previous studies have examined the effects of supervised training on object-selective regions of the brain, much less is known about the degree to which statistical regularities in the retinal input can directly shape the neural substrates involved in object recognition. Unsupervised learning is important because it allows the brain to employ simple self-organizing mechanisms that turn the continuous flux of visual input into the stable objects of our experience. While behavioral and computational work strongly suggests that unsupervised learning plays a key role in object recognition, most related neuroscience work examining the role of input statistics has focused on its effects in early visual areas. Here we propose experiments that combine cutting-edge techniques in fMRI, psychophysics, and computational modeling to examine two hypotheses concerning unsupervised learning in object recognition. First, we propose that neural responses may become tuned to match the range and frequency of shape and object exemplars experienced during unsupervised training. That is, neural responses will increase and become more selective for items seen more frequently during unsupervised training relative to infrequently seen or untrained items. This may provide a mechanism that improves discrimination performance for the stimuli seen most frequently. Second, behavioral and computational evidence suggests the intriguing hypothesis that the brain uses spatio-temporal correlations as a means of binding different images to the same object, allowing recognition of that object across dramatic transformations, such as changes in its appearance due to rotation. We will determine whether spatio-temporal correlations in the visual input during unsupervised training increase the invariance of both brain responses and perceptual performance, relative both to similar items trained in an uncorrelated manner and to pre-training responses and performance. In addition, we will examine whether mechanisms of unsupervised learning generalize to supervised learning. In all of our experiments we will examine neural responses and performance both before and after unsupervised training, and use computational modeling to link fMRI data to the possible underlying neural mechanisms, such as sharpening of neural tuning and increased firing rates. The proposed work will fill important gaps in knowledge by providing the first account of the neural mechanisms that generate effective representations for object recognition from the statistics of visual experience.

Public Health Relevance

The results of these studies will be important for understanding the role of visual experience in shaping normal visual representations. Because these mechanisms do not require explicit instruction, they are especially important for unraveling the means by which pre-verbal children and animals learn to recognize objects. Understanding these mechanisms will form a much-needed foundation for studying developmental disorders such as congenital prosopagnosia, autism, and Williams Syndrome. Further, if we find significant behavioral improvements due to the statistics of the visual inputs, these training paradigms may be used as an intervention to offset developmental visual disabilities.

National Institutes of Health (NIH)
National Eye Institute (NEI)
Research Project (R01)
Study Section
Central Visual Processing Study Section (CVP)
Program Officer
Steinmetz, Michael A
Stanford University
Schools of Arts and Sciences
United States
Weiner, Kevin S; Yeatman, Jason D; Wandell, Brian A (2017) The posterior arcuate fasciculus and the vertical occipital fasciculus. Cortex 97:274-276
Tian, Moqian; Yamins, Daniel; Grill-Spector, Kalanit (2016) Learning the 3-D structure of objects from 2-D views depends on shape, not format. J Vis 16:7
Hammer, Rubi; Sloutsky, Vladimir; Grill-Spector, Kalanit (2015) Feature saliency and feedback information interactively impact visual category learning. Front Psychol 6:74
Tian, Moqian; Yamins, Dan; Grill-Spector, Kalanit (2015) Learning invariant object representations: asymmetric transfer of learning across line drawings and 3D cues. J Vis 15:1088
Winawer, Jonathan; Witthoft, Nathan (2015) Human V4 and ventral occipital retinotopic maps. Vis Neurosci 32:E020
Tian, Moqian; Grill-Spector, Kalanit (2015) Spatiotemporal information during unsupervised learning enhances viewpoint invariant object recognition. J Vis 15:7
Miller, Kai J; Hermes, Dora; Witthoft, Nathan et al. (2015) The physiology of perception in human temporal lobe is specialized for contextual novelty. J Neurophysiol 114:256-63
Witthoft, Nathan; Winawer, Jonathan; Eagleman, David M (2015) Prevalence of learned grapheme-color pairings in a large online sample of synesthetes. PLoS One 10:e0118996
Witthoft, Nathan; Nguyen, Mai Lin; Golarai, Golijeh et al. (2014) Where is human V4? Predicting the location of hV4 and VO1 from cortical folding. Cereb Cortex 24:2401-8
LaRocque, Karen F; Smith, Mary E; Carr, Valerie A et al. (2013) Global similarity and pattern separation in the human medial temporal lobe predict subsequent memory. J Neurosci 33:5466-74

Showing the most recent 10 out of 18 publications