This award supports a collaborative project called "Frontiers in Activity Recognition," in which a group of experts from different fields of computer science, engineering, mathematics, and statistics convene at a workshop to be held in the vicinity of UCLA. One component of the workshop consists of interactive break-out sessions in which different approaches to activity representation (descriptors) and recognition will be analyzed. A second component is a competition, announced to the broad public ahead of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), in which an extensive dataset provided by a third party will be released, with benchmarks, and contestants will be invited to submit their best results on the detection of a number of action categories. The proposers of high-ranking approaches will be invited to the workshop to present their results and discuss them in the context of the analysis of the state of the art to be performed as part of the field assessment. The workshop can have broad impact on many applications, ranging from security (surveillance, monitoring) to environmental science (habitat monitoring, global warming), industrial operations (factory floor optimization), multi-media and information retrieval (content-based video meta-data extraction), entertainment (input devices for games), and transportation (driver assistance).
Automated analysis of video is becoming increasingly important in applications ranging from security to environmental monitoring, industrial engineering, multi-media and content-based retrieval, entertainment, and transportation, to mention just a few. The analysis of actions and events has emerged as a particularly challenging domain, where significant activity is underway, and yet progress has been elusive in all but the most structured scenarios. The Workshop on Frontiers of Activity Recognition was held on June 29-30, 2010, in Los Angeles, with the goal of bringing together the foremost experts on Activity Recognition in Video to assess the state of the art, identify challenges and opportunities, and chart the path ahead. In addition to a number of invited participants, a Challenge and Dataset were issued and circulated worldwide prior to the Workshop, inviting researchers to submit their results on a new benchmark dataset recently released under the aegis of DARPA's VIRAT Program. The dataset was made publicly available, and several contestants entered their results, from which a number of entries were selected for presentation at the Workshop. Among the key issues in need of development, representational ambiguity was identified as the most challenging, as significant within-class variability has to be captured and modeled: the same action can manifest itself in a wide variety of ways depending on the actor, the viewer, and environmental conditions including clutter, illumination, and other nuisance factors. Many actions occur at different time scales and involve interactions among multiple people, or between people and objects. In addition, the absence of a clear taxonomy and an unambiguous classification of actions makes it difficult to develop benchmarks, to evaluate existing schemes, and to quantify progress. During the workshop, the current state of the art was surveyed and critiqued, and failures and promising developments were identified.
Following parallel progress in the field of object recognition, it was determined that progressive challenges, where researchers compete on benchmarks of increasing difficulty, are desirable, although the limitations of such challenges were evident in the results of the preliminary competition on the released dataset. In addition to empirical evaluation of progress based on benchmark datasets, which is susceptible to the limitations of those datasets, the workshop also called for analytical work characterizing the properties of spatio-temporal descriptors for action. A detailed report of the findings of the workshop is available through NSF.