Data Stream computation is emerging as one of the important models for analysis, monitoring and event detection in context of large networks, e.g., the internet, telephone networks, and sensor networks for observational sciences. In general, data streams model computations over massive data sets. Often, the size of the data is so large that it incurs too high a cost (in time or infrastructure) to allow efficient random access.

Thus in a data stream algorithm the space allowed to an algorithm is significantly smaller than the size of the input. In several scenarios, the input data is not even stored elsewhere, allowing only a single pass over the data.

Due to the space bounds, data stream computation typically cannot seek exact solutions and therefore consider approximation algorithms to find near optimal solutions. Furthermore, the applications of data streams, i.e., monitoring and analysis, often prefer a fast near optimal solution compared to an expensive polynomial time exact solution.

The goal of this proposal is to design approximation algorithms for several key problems in data streams. The proposal will also attempt to develop broad techniques and frameworks for stream algorithms. Along with our theoretical investigations, we will also explore problems related to practice. Given the relevance of the data streams to large networks the proposed algorithmic research in data streams will have a broad impact.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Application #
0430376
Program Officer
Tracy J. Kimbrel
Project Start
Project End
Budget Start
2004-09-01
Budget End
2008-08-31
Support Year
Fiscal Year
2004
Total Cost
$200,000
Indirect Cost
Name
University of Pennsylvania
Department
Type
DUNS #
City
Philadelphia
State
PA
Country
United States
Zip Code
19104