Passive Vision is the analysis of video from a camera that is not moving. Many cameras do not move, and continually watch a specific scene -- an ATM, an airport security desk, or a traffic intersection -- for months or years. Much as Active Vision (the ability to intentionally control camera motion) simplifies problems in structure from motion, Passive Vision simplifies statistical image analysis by observing statistics of the same scene for very long time periods.
This project develops a framework to study the statistics of fixed-viewpoint video. General statistics of natural video underlie current models of image and video compression and provide a statistical context for general image processing. But for video taken from a single viewpoint, the same analytic tools find much more specific statistical correlations, and these correlations relate to important scene features. For example, image regions that share geometric features such as surface normal and depth have correlated responses to natural lighting changes. Likewise, the parts of a tree waving in the wind tend to move together.
Furthermore, automated tools that develop statistics of specific video sequences, accumulated over time, promise to ground a number of probabilistic algorithms in surveillance. Surprisingly simple, local statistics of image derivatives find anomalous objects in scenes with significant background motions and find complicated patterns of motions of objects in a scene. Within surveillance, characterizing the statistics of background variations captured over weeks or months provides a foundation to more formally address questions of slow background drift (due to clouds, shadows, or seasons), and when or whether moving objects that stop should be included in the background.
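As a rough illustration of the idea that local statistics of image derivatives can flag anomalous objects, the following sketch accumulates, at every pixel, the mean and covariance of the spatial and temporal derivatives (Ix, Iy, It) over a long sequence, and then scores a new frame by its Mahalanobis distance from that model. The function names, the per-pixel Gaussian model, and the regularization constant are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def fit_background(frames, reg=1e-3):
    """Fit per-pixel statistics of image derivatives.

    frames: float array of shape (T, H, W) from a static camera,
            with intensities assumed to lie roughly in [0, 1].
    reg:    assumed regularization added to each covariance diagonal.
    Returns the per-pixel mean (H, W, 3) and inverse covariance (H, W, 3, 3)
    of the derivative vector (Ix, Iy, It).
    """
    It = np.diff(frames, axis=0)                    # temporal derivative, (T-1, H, W)
    Iy, Ix = np.gradient(frames[:-1], axis=(1, 2))  # spatial derivatives, (T-1, H, W)
    d = np.stack([Ix, Iy, It], axis=-1)             # (T-1, H, W, 3)
    mean = d.mean(axis=0)
    c = d - mean
    cov = np.einsum('thwi,thwj->hwij', c, c) / c.shape[0]
    cov += reg * np.eye(3)                          # keep every covariance invertible
    return mean, np.linalg.inv(cov)

def anomaly_score(prev, curr, mean, inv_cov):
    """Per-pixel squared Mahalanobis distance of the new derivatives."""
    It = curr - prev
    Iy, Ix = np.gradient(prev)
    d = np.stack([Ix, Iy, It], axis=-1) - mean
    return np.einsum('hwi,hwij,hwj->hw', d, inv_cov, d)
```

Pixels whose score is large relative to the scores seen during training are candidates for anomalous objects. Because the model is fit separately at each pixel, regions with consistently busy background motion acquire a wide covariance and are penalized less than regions that are normally still.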
This research program formalizes heuristic approaches to key problems in surveillance and offers a broader understanding of the statistics of natural images. This provides the foundation for a potentially large body of research in learning scene-specific algorithms for image representation and coding, image de-noising, object recognition, anomaly detection, and scene annotation --- key problems in using Computer Vision to address current Homeland Security needs.
Project web page: www.cse.wustl.edu/~pless/PassiveVision.html
This project explored ways to understand long-term image data captured of the same scene. Many cameras do not move, and continually watch a specific scene: an airport security desk, a beach, a volcano. In each case, the fact that all the images are from the same scene makes it easier to answer questions about what is happening in each image.
One basic question is: "where is the camera?" There are many live webcams broadcasting online from unknown locations; these cameras can be geo-located because the lighting and weather changes they observe depend on where the camera is. We offered one of the first approaches to geo-locate a time-series of images. The algorithm uses the fact that outdoor scenes have changes that are very consistent, and we extended this to also geo-calibrate cameras (i.e., find their orientation and zoom level).
Another question is "what is in the scene?" Others have tried to recognize objects by their appearance in one image, but we have explored what can be learned by measuring the time scale over which images change. Long-term time-lapses give an approach to automatically labeling scene locations (like trees) that vary over annual time scales and locations (like eastward-facing walls) that are consistently brighter in the morning, and to segmenting objects in a scene based on very small motions. At shorter time scales, we demonstrate the ability to build a 3D model of a scene from a time-lapse of clouds passing overhead, and the ability to automatically key in on important events, even when the background of the scene includes complex motions like waves crashing on a beach or traffic following a typical traffic pattern.
To support our research, and the larger community, we have built and actively share a dataset called the Archive of Many Outdoor Scenes (AMOS). AMOS archives and organizes available webcam imagery by discovering publicly available webcam feeds, indexing what was in each scene, geo-locating the imagery so we know where in the world it was from, and calibrating these cameras so that image measurements can be related to real world quantities. Collectively, this research helps address national needs in homeland security and the need to understand long-term changes in the environment.
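To make the geo-location cue described above concrete, here is a deliberately simplified sketch of one way lighting changes constrain camera position: the UTC time at which an outdoor scene is brightest approximates local solar noon, which in turn gives a rough longitude. This is not the project's algorithm. The function name and inputs are hypothetical, and the sketch ignores weather, the equation of time (which can shift solar noon by roughly a quarter of an hour), and latitude entirely.

```python
import numpy as np

def estimate_longitude(utc_hours, brightness):
    """Rough longitude (degrees east) from brightness over the day.

    utc_hours:  capture times as hours of the day in UTC (0 to 24).
    brightness: mean image brightness at each capture time.
    Assumes brightness peaks near local solar noon.
    """
    utc_hours = np.asarray(utc_hours, dtype=float) % 24.0
    brightness = np.asarray(brightness, dtype=float)
    # Brightness-weighted circular mean of the time of day, so that a
    # daylight period that wraps past midnight UTC is handled correctly.
    angle = 2.0 * np.pi * utc_hours / 24.0
    w = brightness - brightness.min()
    noon_angle = np.arctan2((w * np.sin(angle)).sum(), (w * np.cos(angle)).sum())
    noon_utc = (noon_angle * 24.0 / (2.0 * np.pi)) % 24.0
    # Solar noon arrives one hour earlier in UTC for every 15 degrees east.
    lon = (12.0 - noon_utc) * 15.0
    return (lon + 180.0) % 360.0 - 180.0
```

Averaging brightness over many days, as a long-running static camera allows, would reduce the effect of clouds on the estimate. Latitude would need a further cue such as seasonal variation in day length.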