Robot vision systems should be fast, to shorten the reaction times of robots to events in the visual world; capable of solving multiple vision problems simultaneously; and aware of their limitations. These properties are critical for robotic safety and collaboration. Safety is enhanced by faster reaction times (e.g., a car that detects obstacles sooner has more room to stop before hitting them) and by self-awareness (e.g., a robot should stop operating in situations that it deems too hard to succeed in). Collaboration is enhanced by scalability (which allows co-robots to solve more problems and thus behave more like human collaborators) and by self-awareness (which simplifies the division of tasks between humans and robots, or among teams of robots, with different skills). However, these properties have not been the focus of computer vision research, which has mostly addressed the design of networks that solve single tasks, usually require heavy computation and run at relatively low frame rates, and simply attempt to process all examples without any consideration for how difficult they are. This project addresses all these challenges, laying the foundation for a new generation of robotic perception systems that are more efficient, scalable, and self-aware. The research has applicability in areas of societal relevance, such as manufacturing, self-driving vehicles, intelligent systems, assisted living, and homeland security. Educationally, the project will provide exciting opportunities for both graduate and undergraduate research.
This project pursues a research agenda composed of several integrated contributions that advance the state of the art in deep learning for robotic vision. These include 1) novel neural network quantization techniques that address the quantization of both network weights and activations, leading to deep learning models that can be fully implemented with binary operations, significantly enhancing the speed of all AI computations; 2) new families of networks that exploit extensive parameter sharing to achieve scalable inference for task ecologies, substantially increasing the number of networks that can be cached in a processor and, therefore, the number of vision problems that can be solved simultaneously by a robot; and 3) new network architectures for self-aware deep learning, capable of assessing the difficulty of each example, predicting failures, and refusing to process examples that are too difficult, so as to mitigate the possibility of catastrophic errors.
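To illustrate why quantizing both weights and activations allows a model to run on binary operations, the following minimal Python sketch computes a dot product between two binarized vectors using only XOR and a bit count. It is an illustration under stated assumptions, not the project's quantization technique: sign-based binarization is assumed here as the simplest choice, and the names binarize, pack_bits, and binary_dot are hypothetical helpers introduced for this example.

```python
def binarize(values):
    """Map real values to {-1, +1} using the sign (assumed scheme; 0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Encode a {-1, +1} vector as an integer: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(word_a, word_b, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Positions where the bits differ contribute -1 and matching positions
    contribute +1, so the result is n - 2 * (number of differing bits).
    """
    mismatches = bin(word_a ^ word_b).count("1")
    return n - 2 * mismatches


# Illustrative values only; they do not come from any trained network.
weights = [0.7, -1.2, 0.1, -0.4, 2.3, -0.9, 0.0, 1.5]
activations = [-0.3, -0.8, 0.6, 0.2, 1.1, 0.4, -2.0, 0.9]

bw, ba = binarize(weights), binarize(activations)

# Reference: ordinary multiply-accumulate on the binarized vectors.
reference = sum(w * a for w, a in zip(bw, ba))

# Same quantity computed with bitwise operations only.
fast = binary_dot(pack_bits(bw), pack_bits(ba), len(bw))

print(reference, fast)
```

The two printed values agree because each multiply-accumulate over {-1, +1} values reduces to comparing bits. On hardware, the XOR and bit count operate on whole machine words at once, which is the source of the speedup that binary networks aim for.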
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.