The internet provides an endless supply of images and videos, replete with weakly-annotated metadata such as text tags, GPS coordinates, timestamps, or social media sentiments. This huge resource of visual data offers an opportunity to create scalable and powerful recognition algorithms that do not depend on expensive human annotations. The research component of this project develops novel visual scene understanding algorithms that can effectively learn from such weakly-annotated visual data. The main novelty is to learn from images and videos jointly rather than from either alone. The developed algorithms could have broad impact in numerous fields including AI, security, and agricultural sciences. In addition to its scientific impact, the project carries out complementary educational and outreach activities. Specifically, it provides mentorship to high school, undergraduate, and graduate students, teaches new undergraduate and graduate computer vision courses that have been lacking at UC Davis, and organizes an international workshop on weakly-supervised visual scene understanding.
This project develops novel algorithms to advance weakly-supervised visual scene understanding in two complementary ways: (1) learning jointly from images and videos to take advantage of their complementarity, and (2) learning from weak supervisory signals, such as timestamps, captions, and relative comparisons, that go beyond standard semantic tags. Specifically, it investigates novel approaches to advance tasks like fully-automatic video object segmentation, weakly-supervised object detection, unsupervised learning of object categories, and mining of localized patterns in image and video data that are correlated with the weak supervisory signal. Throughout, the project explores ways to understand and mitigate noise in the weak labels and to overcome the domain differences between images and videos.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.