Technological trends such as faster networks, cheaper data storage, and ubiquitous data logging have given us access to massive amounts of data. Exploiting this data raises two fundamental questions. (a) How should such data be processed? Traditional models of computation and notions of efficiency need to be reconsidered when monitoring Gbps network traffic, mining petabytes of search engine data, or processing data that is distributed across multiple low-power sensors. (b) What should be computed about such data? The data that accumulates fastest is often noisy, redundant, or plagued by internal inconsistencies. How can useful information be extracted from it?

Over the last decade, the study of sketching (a form of compression based on linear projection) and stream computation (space-bounded computation where the input is processed sequentially) has sought to address aspects of the above questions. The research goal of this project is to initiate and pursue a variety of new directions for these computational models. These include: (a) developing a more systematic understanding of computation in the existing models by seeking broad characterizations of problem tractability and by developing "super synopses" that solve entire families of related problems; (b) extending and tailoring existing models to address a wider range of applications, such as processing stochastically generated data; and (c) establishing a general and intellectually intriguing abstraction of the challenges of computing with massive data sets, one that subsumes sketching and stream computation.
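
To make the two models concrete, the following is a minimal illustrative sketch, not a method taken from the project itself. It assumes a random +/-1 projection matrix (in the spirit of the well-known "tug-of-war" estimator) and a stream of (index, delta) updates to an implicit frequency vector x. The names make_sign_matrix, LinearSketch, and project are hypothetical and chosen only for this illustration. The example shows the two properties named above: the input is processed sequentially while only a small summary is stored, and the summary is a linear projection, so summaries of separate streams can simply be added.

```python
import random


def make_sign_matrix(rows, dim, seed=0):
    """Random +/-1 projection matrix with `rows` rows and `dim` columns.

    Assumption for illustration: the matrix is stored explicitly. A genuinely
    space-bounded implementation would generate signs from compact hash
    functions instead of storing rows * dim entries.
    """
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(rows)]


def project(matrix, x):
    """Plain matrix-vector product S @ x, used only to check the sketch."""
    return [sum(s * xi for s, xi in zip(row, x)) for row in matrix]


class LinearSketch:
    """Maintains S @ x for a vector x that is seen only as a stream of updates."""

    def __init__(self, matrix):
        self.matrix = matrix
        self.values = [0] * len(matrix)

    def update(self, index, delta=1):
        # One stream item means x[index] += delta; x itself is never stored.
        for r, row in enumerate(self.matrix):
            self.values[r] += row[index] * delta

    def merge(self, other):
        # Linearity: sketch(x + y) = sketch(x) + sketch(y).
        merged = LinearSketch(self.matrix)
        merged.values = [a + b for a, b in zip(self.values, other.values)]
        return merged

    def estimate_f2(self):
        # The mean of the squared counters estimates sum_i x[i]**2.
        return sum(v * v for v in self.values) / len(self.values)


# Two streams, e.g. observed at two different sensors.
S = make_sign_matrix(rows=64, dim=8, seed=1)
stream_a = [(0, 1), (3, 2), (0, 1), (5, -1)]
stream_b = [(3, 1), (7, 4)]

sketch_a, sketch_b = LinearSketch(S), LinearSketch(S)
for index, delta in stream_a:
    sketch_a.update(index, delta)
for index, delta in stream_b:
    sketch_b.update(index, delta)

# Reference computation that stores the full vector, for comparison only.
x = [0] * 8
for index, delta in stream_a + stream_b:
    x[index] += delta

assert sketch_a.merge(sketch_b).values == project(S, x)
print("estimated F2:", sketch_a.merge(sketch_b).estimate_f2())
print("exact F2:", sum(v * v for v in x))
```

Because the summary is linear, it handles deletions (negative deltas) and distributed inputs without any extra machinery, which is one reason linear sketches suit settings such as data spread across multiple sensors.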

In conjunction with these research goals, the project includes various educational and broader impact initiatives that are designed to ensure a wide dissemination of research results and to train graduate and undergraduate students.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0953754
Program Officer: Balasubramanian Kalyanasundaram
Project Start:
Project End:
Budget Start: 2010-04-01
Budget End: 2015-03-31
Support Year:
Fiscal Year: 2009
Total Cost: $539,634
Indirect Cost:
Name: University of Massachusetts Amherst
Department:
Type:
DUNS #:
City: Amherst
State: MA
Country: United States
Zip Code: 01003