This project is built on interdisciplinary collaboration between multimedia and natural language processing (NLP) researchers. The goal of this exploratory research is to provide effective methods for organizing, searching, mining, and reasoning with web-scale multimedia. The approach is based on the formulation of a structured multimedia database, called Multimedia Information Networks (MINets), which enables new search and mining paradigms such as keyword queries over unlabeled image/video databases, query expansion in content-based image retrieval (CBIR), and similarity measurement between different modalities (e.g., between an image and a piece of text). In addition, the MINets framework is expected to support robust inference (e.g., recognizing objects and activities in images or videos) in the presence of noise and uncertainty. In its simplest form, a MINet is a graph whose nodes are either concepts (text) or data items (such as images), and whose links represent ontological/semantic relationships between concepts, attachments of images to concepts, and visual-similarity measures between images. Constructing an experimental MINets framework involves crawling the Web for a particular domain, gathering images with associated text, and exploiting natural language processing, computer vision, and mining techniques to establish the concepts, associated images, interconnecting links, and an ontology that supports inference. Validation against current web search engines and CBIR techniques is expected to provide a proof of concept for the novel MINets framework.
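The graph structure described above can be sketched in a few lines of code. The following is an illustrative toy model only, assuming the node types and link kinds named in the abstract (concept nodes, image nodes; semantic, attachment, and similarity links); all class and method names are hypothetical and not part of the actual system.

```python
class MINet:
    """Toy Multimedia Information Network: a typed graph with labeled links."""

    def __init__(self):
        self.nodes = {}   # node_id -> {"type": "concept" or "image"}
        self.edges = []   # (src, dst, kind, weight)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = {"type": node_type}

    def add_edge(self, src, dst, kind, weight=1.0):
        self.edges.append((src, dst, kind, weight))

    def images_for_concept(self, concept):
        """Keyword query over unlabeled images, answered via attachment links."""
        return [dst for src, dst, kind, _ in self.edges
                if src == concept and kind == "attachment"
                and self.nodes[dst]["type"] == "image"]

net = MINet()
net.add_node("hurricane", "concept")
net.add_node("storm", "concept")
net.add_node("img_001.jpg", "image")
net.add_edge("hurricane", "storm", "semantic")          # concept-concept link
net.add_edge("hurricane", "img_001.jpg", "attachment")  # concept-image link

print(net.images_for_concept("hurricane"))  # ['img_001.jpg']
```

A keyword query against unlabeled images thus reduces to traversing attachment links from the matching concept node, which is the paradigm the abstract describes.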

This interdisciplinary exploratory project is expected to yield a general theoretical and algorithmic MINets framework that will provide new searching, mining, and reasoning capabilities for multimedia data. It will help to define new research areas in the effective utilization of multimedia information sources for cross-media and cross-conceptual knowledge discovery and analysis, large-scale annotation, information fusion, and inference. Project results, including open-source software, annotated corpora, and scoring metrics, will be disseminated via the project website (https://netfiles.uiuc.edu/qi4/www/MINets.htm). This project will provide research opportunities for graduate and undergraduate students.

Project Report

In this project, we developed a large-scale database system, called Multimedia Information Networks (MINets), to structure web-scale multimedia content on the World Wide Web and on social media websites such as Facebook and Twitter. Specifically, we constructed a set of hierarchical semantic concepts to automatically classify the messages, posts, and images shared by users. This supports a database system for efficient search over huge volumes of multimedia content. As a prototype, we built an experimental platform to collect and analyze tweets about natural disasters, such as Hurricane Irene (2011) and the 2011 Tōhoku earthquake and tsunami, including the text of the tweets as well as the external images and videos they link to. By measuring cross-media similarity, our system can recognize whether the linked images and videos are truly relevant to the tweet posts, or whether they were linked by mistake or maliciously faked. By linking this cross-media content together, we then developed a hierarchical concept set, called an ontology, as a universal language to index the multimodal data in the database. The concepts are carefully defined in a natural language (e.g., English) so as to remove conceptual ambiguity. For example, the concept "Apple" has distinct meanings in different contexts: either a kind of fruit or the name of an IT company. Our system is aware of such natural-language ambiguity and can differentiate the senses in context with the help of the learned ontology. In this way, when a user submits a query, the system attempts to understand the user's real intention and recommend the most relevant content. All of these technologies are driven by our fundamental discoveries and inventions in building web-scale MINets, which intelligently connect cross-media content and use the ontology to accurately understand multimedia content.
These discoveries and inventions have the potential to enable better information retrieval systems when we wish to find desired multimedia content among billions of websites and tweet posts.
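The cross-media relevance check described above can be sketched as a similarity threshold over feature vectors. This is a minimal illustration under stated assumptions: the report does not specify the features or the decision rule, so the embeddings, the cosine measure, and the threshold below are all hypothetical placeholders.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_relevant(text_vec, image_vec, threshold=0.5):
    # Flag a linked image as relevant to a tweet when their feature
    # vectors agree; the 0.5 threshold is an illustrative assumption.
    return cosine_similarity(text_vec, image_vec) >= threshold

tweet_vec = [0.9, 0.1, 0.0]   # hypothetical text feature vector
image_vec = [0.8, 0.2, 0.1]   # hypothetical image feature vector
print(is_relevant(tweet_vec, image_vec))  # True
```

Images scoring below the threshold would be treated as mistakenly linked or potentially faked, per the filtering behavior described in the report.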

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
1144111
Program Officer
Maria Zemankova
Project Start
Project End
Budget Start
2011-08-01
Budget End
2013-07-31
Support Year
Fiscal Year
2011
Total Cost
$199,360
Indirect Cost
Name
University of Illinois Urbana-Champaign
Department
Type
DUNS #
City
Champaign
State
IL
Country
United States
Zip Code
61820