We live in a world of ubiquitous imagery, where the number of images at our fingertips is growing at a seemingly exponential rate. These images come from a wide variety of sources, including Internet mapping sites, webcams, surveillance and reconnaissance cameras, and millions of photographers around the world uploading billions of images to photo-sharing websites. Taken together, these sources can be thought of as a single distributed camera capturing the entire world at unprecedented scale, continually documenting its cities, mountains, buildings, people, and events.
This research is creating the basic computational tools for "calibrating" this distributed camera, through a world-wide database of 3D models built from Internet photo collections using computer vision techniques. The focus is on creating faster, more robust algorithms for 3D reconstruction from unstructured photo collections, as well as techniques for world-scale pose estimation: computing precisely where in the world a photo was taken from image data alone. These tools are yielding large, world-wide databases of calibrated imagery that can help answer questions in science (e.g., finding all available photos of Central Park to track the flowering times of different plants) and engineering (e.g., finding every photo ever taken of a particular bridge to help determine why it collapsed), and that have impact on other areas including security, consumer photography, and multimedia search. This research is closely integrated with education and outreach, and includes plans for a summer workshop in which high-school students engage with 3D vision technologies.
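To make the pose-estimation step concrete, the sketch below illustrates one standard recipe for localizing a single photo against an existing 3D model: match 2D features in the query image to 3D points in the model, then solve a robust Perspective-n-Point (PnP) problem to recover the camera's position and orientation. This is a minimal illustration using OpenCV with synthetic correspondences and assumed pinhole intrinsics, not the project's actual pipeline; every array below is a hypothetical placeholder.

```python
import numpy as np
import cv2

# Synthetic stand-in for a 3D model: 50 points in front of the camera.
rng = np.random.default_rng(0)
points_3d = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0],
                        size=(50, 3)).astype(np.float32)

# Ground-truth pose (to be recovered) and assumed pinhole intrinsics.
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.3, -0.1, 0.5])
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])

# Project the model points to get the 2D observations a query photo would
# contain; in a real system these come from feature matching, not projection.
points_2d, _ = cv2.projectPoints(points_3d, rvec_true, tvec_true, K, None)
points_2d = points_2d.reshape(-1, 2)

# Robust Perspective-n-Point: RANSAC discards outlier correspondences and
# recovers the camera's rotation (rvec) and translation (tvec).
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    points_3d, points_2d, K, distCoeffs=None, reprojectionError=2.0)

if ok:
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    center = -R.T @ tvec         # camera position in model/world coordinates
    print("recovered camera center:", center.ravel())
```

At world scale, the hard part is finding reliable 2D-3D matches among billions of model points; the RANSAC loop inside the PnP solver is what keeps the estimate robust when many of those candidate matches are wrong.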