Data are increasingly generated, stored, and processed distributively. Meanwhile, when large amounts of data are generated, ambiguity, uncertainty, and errors are inherently introduced, especially in a distributed setup. It is best to represent such data in a distributed probabilistic database. In distributed data management, summary queries are useful tools for obtaining the most important answers from massive quantities of data effectively and efficiently, e.g., top-k queries, heavy hitters (aka frequent items), histograms and wavelets, threshold monitoring queries, etc. This project investigates novel query processing techniques for various, important summary queries in distributed probabilistic data.

Broadly classified, this project examines both snapshot summary queries in static (i.e., no updates) distributed probabilistic databases, and continuous summary queries in dynamic (i.e., with updates) distributed probabilistic databases. A number of techniques are explored to design novel, communication and computation efficient algorithms for processing these queries.

A distributed probabilistic data management system (DPDMS) prototype is implemented based on the query processing techniques developed in this project. This DPDMS is released to and used in practice by scientists and engineers from other science disciplines as well as industry.

Graduate and undergraduate students, including those from minority groups, are actively involved in this project. Findings from the project have been integrated into different courses, demos, and educational projects. For further information, such as publications, data sets, source code, and education initiatives, please visit the project website at www.cs.fsu.edu/~lifeifei/dpdm.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
1200792
Program Officer
Maria Zemankova
Project Start
Project End
Budget Start
2011-09-29
Budget End
2017-08-31
Support Year
Fiscal Year
2012
Total Cost
$529,356
Indirect Cost
Name
University of Utah
Department
Type
DUNS #
City
Salt Lake City
State
UT
Country
United States
Zip Code
84112