Data Stream computation is emerging as one of the important models for analysis, monitoring and event detection in context of large networks, e.g., the internet, telephone networks, and sensor networks for observational sciences. In general, data streams model computations over massive data sets. Often, the size of the data is so large that it incurs too high a cost (in time or infrastructure) to allow efficient random access.
Thus in a data stream algorithm the space allowed to an algorithm is significantly smaller than the size of the input. In several scenarios, the input data is not even stored elsewhere, allowing only a single pass over the data.
Due to the space bounds, data stream computation typically cannot seek exact solutions and therefore consider approximation algorithms to find near optimal solutions. Furthermore, the applications of data streams, i.e., monitoring and analysis, often prefer a fast near optimal solution compared to an expensive polynomial time exact solution.
The goal of this proposal is to design approximation algorithms for several key problems in data streams. The proposal will also attempt to develop broad techniques and frameworks for stream algorithms. Along with our theoretical investigations, we will also explore problems related to practice. Given the relevance of the data streams to large networks the proposed algorithmic research in data streams will have a broad impact.