Images and video are analyzed in terms of parts. Qualitatively, a part of an image is a region that can be extracted consistently, even if the image changes somewhat because of differences in quality, resolution, lighting, viewpoint, or small variations in the shape of the objects being depicted. In video, these regions are extruded through time, forming tubes of sorts that persist over relatively long time intervals.
Technically, parts are regions with high saliency and stability, two notions that this research defines anew using mathematical tools ranging from harmonic functions to new developments in computational topology and spectral graph theory.
Parts are a key handle on image and video structure, as they allow the visual information to be described succinctly and in a stable manner. They lead to indices for retrieval and provide primitives that make it possible for computer software to recognize objects and activities. The main result of this effort is a systematic comparison of the advantages and limitations of the new definition against those of descriptors from the literature.
Applications of part-based visual analysis range from image retrieval, medical and biological imaging, and video interpretation for military and intelligence scenarios, to surveillance, the annotation and editing of images and video clips, and more. The work involves graduate students and undergraduates funded through the NSF REU program. Results of this research are disseminated through scholarly publications and classes at Duke University. A benchmark of evaluation images and video is developed for open use by the research community.