The objective of this research is to develop general tools for reducing the effective size of massive datasets and to demonstrate the capabilities of these tools by applying them to the efficient representation of images and streaming video. The approach is based on a new class of randomized algorithms that have proven capable of accelerating key scientific computations, while simultaneously making the algorithms easier to implement on multi-core computers.

With respect to intellectual merit, this project addresses the fundamental scientific question of how to extract information from large and noisy datasets. These datasets are created by many branches of society in quantities and at a rate that make the analysis of the data extremely difficult with state-of-the-art methods. This research is based on the belief that the scale of the problem can be reduced with new algorithms that discover the underlying structure of a dataset by randomly examining chunks of data. Formally, mathematical results on random projections in high-dimensional spaces will be turned into fast algorithms that build low-dimensional representations amenable to knowledge discovery.
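As a rough illustration of the random-projection idea mentioned above, the following minimal sketch maps high-dimensional points to a lower-dimensional space using a Gaussian random matrix and checks how well one pairwise distance is preserved. It is not the project's algorithm: the function names (`random_projection`, `distance`), the synthetic data, and the dimensions (1000 reduced to 256) are all assumptions chosen for illustration.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Project d-dimensional points to k dimensions with a Gaussian random matrix.

    Illustrative helper (hypothetical name); entries are i.i.d. N(0, 1/k),
    so squared lengths are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(k)
    # k x d projection matrix with independent Gaussian entries
    proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    # Each reduced point is the matrix-vector product proj @ p
    return [[sum(r * x for r, x in zip(row, p)) for row in proj] for p in points]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic stand-in for a dataset: 5 points in 1000 dimensions (assumed sizes)
data_rng = random.Random(1)
points = [[data_rng.uniform(-1.0, 1.0) for _ in range(1000)] for _ in range(5)]

reduced = random_projection(points, k=256)

# Ratio of projected distance to original distance for one pair of points;
# a value near 1 means the low-dimensional representation kept the geometry.
ratio = distance(reduced[0], reduced[1]) / distance(points[0], points[1])
print(f"distance ratio after projection: {ratio:.3f}")
```

In this sketch the projection never inspects the data in advance, which is what makes such methods attractive for streaming or very large inputs; the reduced dimension `k` trades accuracy against storage and is a free parameter here.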

With respect to broader impacts, this project has the potential to create a new generation of algorithms for organizing and searching huge databases of images and videos. It could pave the way for novel algorithms to manage the massive datasets created by web searches, biomedical research, cyber security, and other domains. The investigators are mentoring students from the existing "SMART" program, which attracts outstanding students from underrepresented groups into graduate school. Undergraduate students are involved via the existing "Mentoring Through Critical Transition Points" program.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0941476
Program Officer: Junping Wang
Project Start:
Project End:
Budget Start: 2009-09-01
Budget End: 2013-08-31
Support Year:
Fiscal Year: 2009
Total Cost: $535,784
Indirect Cost:
Name: University of Colorado at Boulder
Department:
Type:
DUNS #:
City: Boulder
State: CO
Country: United States
Zip Code: 80309