Retinal images are inherently fragmentary and ambiguous because images of separate entities overlap. The early visual mechanisms, however, are not equipped to parse the overlapping 2-D retinal images into distinct 3-D entities. This job falls to the mid-level mechanisms, whose main role is to represent the distinct entities as separate surfaces. The represented surface information then serves as input to the WHAT and WHERE systems that underlie our 3-D perception of objects and space, respectively. As such, the mid-level mechanisms are not simple "conduits" of information between early- and late-level visual mechanisms but play a crucial role in determining the quality and reliability of the visual information conveyed. Compared to other aspects of visual processing, less is known about the mid-level mechanisms. One of the biggest challenges is to discover how the often fragmentary and ambiguous retinal information is transformed into reliable surface representations, presumably through a spreading-in operation. When the image of a single entity is broken into parts by occlusion, a surface interpolation operation is required to integrate the parts into a global surface. Moreover, the inputs from the two eyes that contribute to these operations can be disparate in content and location. Given the myriad complexities of the visual inputs, it is further proposed that the mid-level mechanisms must rely on internal assumptions (perceptual rules) and feedback from the higher visual levels for guidance in representing surfaces. How these operations are accomplished is still unclear. To address this gap, this proposal uses the human psychophysical approach to investigate the above issues by focusing on three specific aims.
Aim 1 investigates how the spreading-in operation represents surfaces with texture patterns, which is more complex than representing texture-free surfaces. It is proposed the principle of reducing coding redundancy that governs the spreading-in operation causes the global surface representation operation to be efficient but prone to poor resolution. The latter could be one basis of the well-known "crowding effect" phenomenon.
Aim 2 investigates the texture-surface interpolation operation. Because attention and object knowledge are known to play a role, the research investigates how these top-down factors influence surface integration.
Aim 3 investigates the long-term plasticity of the mid-level mechanisms. Perceptual learning experiments will be conducted to reveal how extensive training modifies the perceptual rules implemented at the mid-level. The long-term goal of this proposal is to advance our knowledge of how visual information is processed and represented by the mid-level mechanisms. This knowledge will help us better understand how humans perceive the visual world, and will provide a clinical basis for behavioral diagnoses and treatments of visual dysfunctions related to amblyopia, strabismus and aging.
Early-level visual information is often fragmentary and ambiguous because images of distinct entities overlap. One role of the mid-level mechanisms is to make sense of this information by representing the distinct entities as separate surfaces. Discovering how this is achieved will lead to a better scientific understanding of how humans perceive the visual world, and will provide a clinical basis for non-invasive diagnoses and treatments of visual dysfunctions related to amblyopia and strabismus.