This project is exploring how capabilities for geometric image understanding can change the way people approach the problem of automatically interpreting the semantic content of individual photos or videos. By developing algorithms for accurately localizing cameras from images that integrate other sources of geo-spatial data, such as 3D models of buildings and maps of urban areas, the project aims to significantly improve the ability of computer vision systems to understand image content. Utilizing strong prior information for scene understanding has a wide range of important practical applications. An assistive robot providing elderly care in a home should leverage knowledge of the appearance and location of objects in its immediate environment while adapting to changes on multiple time scales (a coffee cup sitting on the table moves much more frequently than the table itself). A network of self-driving cars could benefit significantly from dynamically updated urban maps built from the stream of data collected by the cars and other cameras (e.g., adapting behavior to a temporary lane closure that changes typical car and pedestrian traffic patterns). The project involves students in research spanning a range of traditional disciplines and is engaging a wider audience across the UC Irvine campus in understanding and applying these technologies to novel social and scientific applications.

This research investigates an alternate approach in which scene priors (including affordances and semantic attributes) are represented in 3D geo-spatial model coordinates rather than in 2D image space. Incorporating geometric context into scene understanding has largely been pursued under very weak prior assumptions on scene geometry and camera pose. Importantly, the research allows for direct integration of non-visual data such as GIS maps. The project is developing the appropriate algorithms and datasets to integrating such data along with a continual stream of images to produce a strong, temporally-evolving (4D) scene prior that can improve accuracy of camera pose estimation, monocular geometry, object detection and semantic segmentation.

Project Start
Project End
Budget Start
2016-07-15
Budget End
2020-06-30
Support Year
Fiscal Year
2016
Total Cost
$377,407
Indirect Cost
Name
University of California Irvine
Department
Type
DUNS #
City
Irvine
State
CA
Country
United States
Zip Code
92697