This project exploits the extra depth information in RGB-D (color and depth) image collections to significantly advance the state of the art in visual scene understanding and to make computer vision techniques usable in practical applications. Recent advances in affordable depth sensors have made depth acquisition significantly easier for ordinary users. Depth cameras are becoming common in digital devices and aid automatic scene understanding. The research team develops technologies to take advantage of depth information. In addition to publishing research results, the research team plans to distribute source code and benchmark data sets that could benefit researchers in a variety of disciplines. This project is integrated with educational programs, such as interdisciplinary workshops and courses at the graduate, undergraduate, and professional levels, and with diversity enhancement programs that promote opportunities for disadvantaged groups. The research team collaborates closely with its industrial partner (Intel), involving interns and technology transfer into real products. The project also applies the developed algorithms to assistive technology for the blind and visually impaired.
This research develops the algorithms required to perform real-time segmentation, labeling, and recognition of RGB-D images, videos, and 3D scans of indoor environments. Specifically, the PIs: (1) acquire large labeled RGB-D datasets for training and evaluation, (2) study algorithms that recognize objects and estimate detailed 3D knowledge about the scene, (3) exploit object-to-object contextual relationships in 3D, and (4) demonstrate applications that benefit the general public, including household robotics and assistive technologies for the blind.