A joint behavioral/modeling approach is used to better understand the top-down constraints that guide overt visual attention in realistic contexts. In previous work we developed a biologically plausible model of eye movements during search that used oriented and color-selective linear filters, population averaging over time, and an artificial retina to represent stimuli of arbitrary complexity. The simulated fixation-by-fixation behavior of this model compared well to human behavior, using stimuli ranging from Os and Qs to fully realistic scenes. However, this model was limited in that it had to be shown the target's exact appearance, and it could not exploit scene context to constrain attention to likely target locations. Consequently, it is largely unknown how people shift their attention as they look for scene-constrained targets or targets that are defined categorically. These limitations are addressed in six studies. Studies 1-2 explore how people use scene context to narrow their search for a specific target in realistic scenes. A text precue provides information about the target's location in relation to a region of the scene ("in the field"; Study 1) or a scene landmark ("next to the blue building"; Study 2). Behavioral work quantifies the effects of these informational manipulations on search guidance; computational work implements the behavioral scene constraints and integrates them into the existing search model. Studies 3-6 address the relationship between search guidance and the level of detail in a target's description. Study 3 builds on previous work by designating targets either categorically (e.g., "find the teddy bear") or through use of a preview (e.g., a picture of a specific teddy bear), but increases the number of target categories to determine the boundary conditions on categorical search. Study 4 asks whether categorical targets are coded at the basic or subordinate levels, and Study 5 analyzes the distractors fixated during search to determine the features used to code these categorical targets. In Study 6 we use text labels to vary the degree of information in a target precue (e.g., a work boot target might be described as "footwear", a "boot", or a "tan work boot with red laces"). Study 7 describes the sorts of questions that can be asked once scene constraints and categorical target descriptions are integrated under a single theoretical framework, and Study 8 points to an entirely new research direction made possible by the modeling techniques that will be developed for this project. All of these studies are synergistic in that model predictions are used to guide behavioral studies, which in turn produce the data needed to refine the model and to make even more specific behavioral predictions. The project's long-term objective is to obtain an understanding of how people allocate their overt visual attention in realistic contexts, specifically in terms of how partial information about an object's location in a scene or its appearance can be used to acquire targets in a search task. This understanding is expressed in the form of a computational model, one that can now use simple spatial relations and the visual features of learned target classes to acquire semantically-defined targets.

Public Health Relevance

The attention system has been implicated in a host of neuropsychological disorders, and a visual search task is a key component in diagnoses of attention deficits. By increasing our understanding of the neuronal computations underlying overt search behavior, the proposed work is relevant to public health in its potential to improve the validity of existing instruments for diagnosing attention disorders, and ultimately to improve our understanding of these disorders so that more effective treatments can be provided.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section
Cognition and Perception Study Section (CP)
Program Officer
Rossi, Andrew
State University New York Stony Brook
Schools of Arts and Sciences
Stony Brook
United States
Yu, Chen-Ping; Samaras, Dimitris; Zelinsky, Gregory J (2014) Modeling visual clutter perception using proto-object segmentation. J Vis 14:
Maxfield, Justin T; Stalder, Westri D; Zelinsky, Gregory J (2014) Effects of target typicality on categorical search. J Vis 14:
Schmidt, Joseph; MacNamara, Annmarie; Proudfit, Greg Hajcak et al. (2014) More target features in visual working memory leads to poorer search guidance: evidence from contralateral delay activity. J Vis 14:8
Zelinsky, Gregory J; Adeli, Hossein; Peng, Yifan et al. (2013) Modelling eye movements in a categorical search task. Philos Trans R Soc Lond B Biol Sci 368:20130058
Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C et al. (2013) Modeling guidance and recognition in categorical search: bridging human and computer object detection. J Vis 13:30
Alexander, Robert G; Zelinsky, Gregory J (2012) Effects of part-based similarity on visual search: the Frankenbear experiment. Vision Res 54:20-30
Schmidt, Joseph; Zelinsky, Gregory J (2011) Visual search guidance is best after a short delay. Vision Res 51:535-45
Zelinsky, Gregory J; Loschky, Lester C; Dickinson, Christopher A (2011) Do object refixations during scene viewing indicate rehearsal in visual working memory? Mem Cognit 39:600-13
Alexander, Robert G; Zelinsky, Gregory J (2011) Visual similarity effects in categorical search. J Vis 11:
Neider, Mark B; Zelinsky, Gregory J (2011) Cutting through the clutter: searching for targets in evolving complex scenes. J Vis 11:

Showing the most recent 10 out of 21 publications