Every day, millions of people across the world take photos and upload them to social media websites. Their goal is to share photos with friends and others, but collectively they are creating vast repositories of visual information about the world. Each photo is an observation of how the world looked at a particular point in time and space. Aggregated together, these photos could provide new sources of observational data for use in disciplines like biology, earth science, social science, or history. This project is investigating the algorithms and technologies needed for mining these large collections of photographs and noisy metadata to draw inferences about the physical world. The project has four research thrusts: (1) investigating techniques for identifying and correcting noise in metadata like geo-tags and timestamps, (2) developing algorithms for extracting semantic information from images and metadata, (3) creating methods for robust aggregation of noisy evidence from multiple photos, and (4) validating these techniques on interdisciplinary applications in biology, sociology, and earth science.
The project is laying the foundation for using visual social media as a new source of observational data for a variety of scientific disciplines. The educational component is preparing students for the next generation of "big data" jobs through new undergraduate and graduate courses and online instructional materials. Undergraduate students (particularly from under-represented groups) are recruited to participate in the research program and encouraged to pursue scientific careers. An annual workshop is planned to educate general audiences, particularly senior citizens, about data mining and social media. Source code, datasets, course materials, and other results of the project will be disseminated to the public via the project web site (http://vision.soic.indiana.edu/career/).