This project involves a fundamental research program in video analysis built upon a novel framework for integrating independent modalities of image formation: illumination conditions, object shape, motion, and surface reflectance. We focus on the problem of pose- and illumination-invariant object recognition in video, based on learned models of lighting, motion and shape. We also develop novel scene relighting methods using the illumination models learned from natural videos. The proposed research proceeds by first developing a mathematical framework that relates the appearance of a video sequence to the lighting conditions, the motion of the objects being imaged, and their shape and surface properties. Thereafter, it addresses the inverse problem of recognition from video. The overall mathematical approach is to combine precise geometrical models with statistical data analysis tools, thus combining accuracy and robustness.
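To make the image formation framework concrete, the sketch below renders image intensity under a single distant light using a Lambertian reflectance assumption (a common simplification in this literature; the project's actual models, which also handle motion and more general surface properties, are not specified here). The function name and array shapes are illustrative choices, not part of the original work.

```python
import numpy as np

def lambertian_image(albedo, normals, light):
    """Render per-pixel intensity under one distant light source.

    albedo:  (H, W) surface reflectance at each pixel
    normals: (H, W, 3) unit surface normals (encodes object shape)
    light:   (3,) unit vector pointing toward the light source

    Intensity follows the Lambertian model: I = albedo * max(n . l, 0),
    so appearance is an explicit function of lighting, shape, and
    surface reflectance -- the modalities the framework integrates.
    """
    shading = np.clip(normals @ light, 0.0, None)  # max(n . l, 0), clamps self-shadowed pixels
    return albedo * shading
```

Relighting then amounts to re-evaluating the same forward model with a new `light` vector once albedo and normals have been estimated, while recognition is the inverse problem of inferring shape and reflectance from observed intensities.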
This research will benefit a large number of existing applications, and create new ones, that rely on efficient object tracking under changing environmental conditions. These include applications in national priority areas like homeland security, monitoring of nuclear installations and border security; in commercial domains like video communications, multimedia databases and entertainment; in social causes like wildlife and environmental monitoring; and in medical and biological applications relying on video analysis. Scene relighting by learning the motion and illumination of objects from natural videos will also have a significant impact on the creation of realistic virtual environments.
Progress on this project will be regularly updated at www.ee.ucr.edu/~amitrc/JMIS.htm
The automated analysis of the huge amounts of video data being collected daily is essential for extracting meaningful information from it. Application domains include security and surveillance, disaster response, environmental monitoring and biomedicine. The limited robustness of existing video analysis algorithms is widely acknowledged to be a major bottleneck in the adoption of these automated techniques. This project investigated the fundamental challenges that must be overcome in order to develop robust video analysis algorithms. We achieved several specific goals in this regard.

- For the problem of face recognition, we developed mathematical models that describe the image formation process and used these models to estimate facial pose, expression and motion. Such approaches can lead to improved face recognition techniques.

- For the problem of recognizing human activities in video, we showed how global information about a scene can be used to improve the ability to recognize the activity of a local target. This is often referred to as context-aware activity recognition. We have provided fundamental methods and software to achieve this.

- In biomedical applications, cell tracking is a fundamental problem: the goal is to obtain long cell lineages from which growth patterns can be identified. Building upon the concept of global information in the activity recognition problem above, we showed how cell tracking can be made more efficient by considering not only the local characteristics of each cell, but also its relationship to other cells. Consistency constraints were imposed in both the spatial and temporal directions. This led to significantly improved results over the state of the art.

The fundamental research results have been published in the most selective conferences and journals. Software and datasets corresponding to all three major outcomes reported above have been released (or will be soon) to the broader research community.
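The idea of combining a cell's local cost with a spatial-consistency term can be sketched as a frame-to-frame linking problem. This is a minimal illustration of the general principle, not the published method: the cost, the consistency term (deviation from the global mean motion), the weight `w_context`, and the function name are all assumptions introduced for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def link_cells(prev, curr, w_context=0.5):
    """Link cell detections between two consecutive frames.

    prev, curr: (N, 2) and (M, 2) arrays of cell centroids.
    The assignment cost combines each cell's own displacement (local
    term) with how far that displacement deviates from the mean motion
    of all cells (a crude stand-in for spatial-consistency constraints
    relating a cell to its neighbors).
    """
    disp = curr[None, :, :] - prev[:, None, :]          # (N, M, 2) candidate moves
    local = np.linalg.norm(disp, axis=2)                # distance of each candidate move
    mean_move = curr.mean(axis=0) - prev.mean(axis=0)   # global drift of the population
    context = np.linalg.norm(disp - mean_move, axis=2)  # penalty for disagreeing with neighbors
    rows, cols = linear_sum_assignment(local + w_context * context)
    return list(zip(rows.tolist(), cols.tolist()))      # (prev index, curr index) pairs
```

Chaining such links across frames yields trajectories; temporal consistency would add further terms penalizing abrupt changes in a cell's motion over time, which this two-frame sketch omits.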
The PI has given related talks in different venues to the general public, and his work on face recognition has been featured by PBS/National Geographic.