In this project, the research team explores several challenges in exploiting the relationship between images, video, and the people who view this visual imagery. Areas of exploration include: 1) conducting behavioral experiments to better understand the relationship between human viewers and imagery, 2) developing human-computer collaborative systems for image and video understanding that combine automatic computer vision algorithms with active and passive cues from human viewers, and 3) implementing retrieval and collection organization applications using these collaborative models.
Billions of images and millions of videos are now available online via the infrastructure of highly successful companies such as Google, Microsoft, and Facebook. This wealth of visual data is creating considerable opportunities for communication and community, and tightening the social fabric of our world. In parallel with this explosion in online imagery, there is an increasing proliferation of cameras viewing the user, from the ever-present webcams peering out at us from our laptops to the cell phone cameras carried in our pockets wherever we go. The resulting record of a user's viewing behavior, particularly eye movements, body movements, or descriptions, can provide enormous insight into how people interact with images or video, and can inform the construction of more effective visual applications such as image or video retrieval. In addition, understanding what people recognize, attend to, or describe about an image or video is a necessary step toward the high-level goal of human-centric image understanding, which will benefit research in many diverse fields, including computer vision and behavioral science.