III: Small: Collaborative Research: Generalizable Similarity and Proximity Metrics for Data Exploration

Winslett, Marianne

Abstract

Knowledge bases organize information into graphs of entities, and data exploration algorithms can leverage mathematical properties of these graphs to discover interesting and useful insights about the entities and their relationships. For example, data exploration algorithms can use the graph of Google Knowledge Base to identify people who have common interests, and can discover genes with similar behavior by analyzing the graph of the Genome Knowledge Base. Currently, data exploration tools tend to be quite sensitive to the details of how information is represented in these graphs, making the tools highly effective over some choices of representation but not so effective with others. As a result, data exploration has largely remained the province of experts and data scientists. This project seeks to overcome this dependency and enable a new generation of more general data exploration tools that ordinary users can use to explore data on their own, without an expert by their side.

More specifically, this project is creating effective similarity and proximity search algorithms that deliver the same results regardless of how the underlying knowledge base is represented. The key idea is to use statistical metrics to quantify the degree of similarity between entities or patterns in a manner that is not sensitive to the specific representation of the data. This novel theoretical framework serves as the foundation for more general data exploration algorithms, whose generality and effectiveness are being validated on large real-world knowledge bases.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1421247
Program Officer: Aidong Zhang

Budget Start: 2014-09-01
Budget End: 2018-08-31
Fiscal Year: 2014
Total Cost: $239,599

Institution

University of Illinois Urbana-Champaign, Champaign, IL, United States
