Because automatic extraction processes introduce uncertainty, incompleteness, and inconsistency, query results from current large-scale knowledge bases (KBs) are incomplete, erroneous, and conflicting. The research objective of this proposal is to extend state-of-the-art KB systems into a probabilistic first-order KB system that can infer missing knowledge using rules, prune conflicting knowledge using constraints, and return confidence values for the resulting tuples. The new system and algorithms developed in this proposal can enable advanced online data analysis through a declarative query interface over the large uncertain graphs that exist in many high-impact applications, including knowledge bases, social networks, and biological networks.

To meet this objective, the project will extend the data model, query language, and query processing and optimization techniques of state-of-the-art KB systems to support a probabilistic first-order KB system. The P.I. will design a probabilistic KB graph data model; extend SPARQL into a probabilistic graph query language with additional inference operators; invent new query execution and optimization techniques for scalable inference queries; and implement a new query processing system, using a unified data-parallel and graph-parallel system, over web-scale probabilistic KB graphs.
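
The three capabilities named above (rule-based inference, constraint-based pruning, and confidence values on results) can be illustrated with a toy sketch. This is not the project's system or data model. The facts, predicate names, rule weight, and helper functions are all hypothetical, and the abstract does not specify a probabilistic semantics, so the sketch assumes independent facts (confidences multiply along a rule) and combines multiple derivations of the same fact with a noisy-or.

```python
from collections import defaultdict

# Toy uncertain KB: (subject, predicate, object) -> extraction confidence.
# All facts and numbers are made up for illustration.
FACTS = {
    ("Ann", "born_in", "Gainesville"): 0.9,
    ("Ann", "born_in", "Miami"): 0.4,
    ("Gainesville", "located_in", "Florida"): 0.95,
    ("Miami", "located_in", "Florida"): 0.95,
}


def prune_functional(facts, predicate):
    """Constraint: `predicate` allows one object per subject.

    Among conflicting facts, keep only the most confident one.
    """
    best = {}
    for (s, p, o), conf in facts.items():
        if p == predicate and (s not in best or conf > best[s][1]):
            best[s] = (o, conf)
    return {
        (s, p, o): conf
        for (s, p, o), conf in facts.items()
        if p != predicate or best[s][0] == o
    }


def infer_birth_state(facts, rule_weight=0.8):
    """Rule: born_in(x, c) AND located_in(c, s) => born_in_state(x, s).

    Assumption: each derivation scores rule_weight * conf1 * conf2, and
    several derivations of the same fact are merged with a noisy-or.
    """
    derivations = defaultdict(list)
    for (x, p1, c), conf1 in facts.items():
        if p1 != "born_in":
            continue
        for (c2, p2, s), conf2 in facts.items():
            if p2 == "located_in" and c2 == c:
                derivations[(x, "born_in_state", s)].append(
                    rule_weight * conf1 * conf2
                )
    inferred = {}
    for fact, confs in derivations.items():
        miss = 1.0
        for conf in confs:
            miss *= 1.0 - conf
        inferred[fact] = 1.0 - miss
    return inferred


def query(facts, predicate):
    """Return facts with the given predicate, most confident first."""
    hits = [(f, round(c, 3)) for f, c in facts.items() if f[1] == predicate]
    return sorted(hits, key=lambda fc: -fc[1])


consistent = prune_functional(FACTS, "born_in")          # drop the conflict
closed = {**consistent, **infer_birth_state(consistent)}  # add inferred facts
print(query(closed, "born_in_state"))
```

Pruning before inference keeps the low-confidence conflicting birthplace from contributing to the derived fact. Running the rule on the unpruned facts would instead merge both derivations and report a higher confidence, which is one reason the order of constraint and rule application matters in a system of this kind.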

For further information see the project website at: http://dsr.cise.ufl.edu/Eureka

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1526753
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2015-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2015
Total Cost: $499,917
Indirect Cost:

Name: University of Florida
Department:
Type:
DUNS #:
City: Gainesville
State: FL
Country: United States
Zip Code: 32611