III: Small: Persistent Data Summaries: Temporal Analytics on Big Data Histories

Phillips, Jeff; Li, Feifei

Abstract

An increasing number of applications require the storage of and access to all historical data to support rich analytics, learning, and mining operations. This project develops a series of methods to summarize data so that it can be queried with respect to not just the full data set, as is standard, but with respect to the state of the data set at any historical time. These summaries integrate with large temporal databases, in both offline batched-processing and online streaming application scenarios. The effectiveness of these methods will be demonstrated on an enormous scientific database of atmospheric data collected for 20 years from over 40,000 weather stations. We will work with industry collaborators to help deploy our new algorithms, and the results will be integrated into education and outreach efforts surrounding the growth of data science initiatives.

More specifically, this project extends and combines approximate query processing with temporal big data. In particular, instead of (or on top of) using a multi-version database, this project designs and implements persistent data summaries (PDSs) that offer interactive temporal analytics with strong theoretical guarantees on their approximation quality. In addition to formalizing these models, this project develops practical PDS implementations for sampling-based summaries, data sketches, and core sets that support advanced analytical queries.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1816149
Program Officer: Hector Munoz-Avila

Budget Start: 2018-09-01
Budget End: 2021-08-31
Fiscal Year: 2018
Total Cost: $499,934

Institution

University of Utah, Salt Lake City, UT, United States
