This project develops a learning strategy that combines textual narrative with images, making learning effective without the huge number of training images that a typical visual learning algorithm needs to establish class boundaries. The research team investigates computational models for jointly learning visual concepts from images and textual descriptions of fine-grained categories, for example discriminating between bird species. The research activities have broader impact in three fields: computer vision, natural language processing, and machine learning. There is a strong need for algorithms that automatically understand the content of images and videos, with numerous potential applications in web search, image and video archival and retrieval, surveillance, robot navigation, and other areas. An intelligent system that can use narrative to define and recognize categories would likewise serve a wide range of applications.

This project addresses two research questions. First, given a visual corpus and a textual corpus about a specific domain, how can visual concepts be learned jointly and effectively from both? Second, given these two modalities, how can novel visual concepts be learned from purely textual descriptions of new categories in the domain? The research team approaches the problem on three integrated fronts: Learning, Natural Language Processing (NLP), and Computer Vision. On the learning front, the project investigates and develops algorithms for learning and predicting visual classifiers from side textual information. On the NLP front, the project develops novel methods for learning global and local discriminative category-level attributes and their values from text, with feedback from human computation and from the visual signal. The project investigates supervised and unsupervised methods for detecting visual text, along with learning methods for deep language understanding that build rich domain models from this noisy visual text. On the Vision front, the project addresses detection and classification with side textual information. It investigates models of the shape and appearance of a general category that can specialize to different subordinates, in a way that allows information from text to be interpreted in a proper geometric context and that handles variability in viewpoint and articulation.
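As an illustration only (the abstract does not specify the project's model), one simple formulation of "predicting visual classifiers from side textual information" is a learned linear mapping from text embeddings to classifier weights, so that a classifier for a novel category can be synthesized from its textual description alone. The sketch below uses hypothetical feature dimensions and random placeholder data in place of real image features and text embeddings.

```python
# Minimal sketch, assuming a linear text-to-classifier mapping (illustrative,
# not the project's specified method).
import numpy as np

rng = np.random.default_rng(0)

d_text, d_img, n_seen = 300, 512, 20          # hypothetical feature sizes
T_seen = rng.normal(size=(n_seen, d_text))    # text embeddings of seen categories
W_seen = rng.normal(size=(n_seen, d_img))     # visual classifier weights for seen
                                              # categories (e.g., one-vs-rest linear SVMs)

# Ridge regression from text space to classifier-weight space:
#   M = argmin_M ||T_seen M - W_seen||^2 + lam ||M||^2
lam = 1.0
M = np.linalg.solve(T_seen.T @ T_seen + lam * np.eye(d_text), T_seen.T @ W_seen)

# Zero-shot step: synthesize a classifier for a novel category from its
# textual description alone, then score an image feature against it.
t_novel = rng.normal(size=(d_text,))          # text embedding of an unseen category
w_novel = t_novel @ M                         # predicted visual classifier weights

x = rng.normal(size=(d_img,))                 # an image feature vector
score = x @ w_novel                           # higher score => image matches description
print(score)
```

In practice the text and image features, the form of the mapping, and the training objective would all be design choices of the research; the point of the sketch is only to make concrete what "learning novel visual concepts from pure textual descriptions" can mean.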

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1409257
Program Officer: Jie Yang
Project Start:
Project End:
Budget Start: 2014-06-15
Budget End: 2019-05-31
Support Year:
Fiscal Year: 2014
Total Cost: $463,208
Indirect Cost:
Name: Columbia University
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10027