Modern signal processing frequently involves statistical inference problems on massive datasets. For example, the experiments at the Large Hadron Collider at CERN generate hundreds of petabytes of data each year, which must be stored and processed efficiently in order to further our understanding of particle physics. Similar challenges arise in seismic monitoring, where massive amounts of data are acquired over large areas via cellphone accelerometers. Analyzing such large datasets is usually viewed as a substantial computational challenge. However, if data are a signal processor's main resource, then access to more data should be viewed as an asset rather than a burden, and larger datasets should lead to a reduction in the runtime of data analysis algorithms.

This project blends concepts from computer science and statistical signal processing to address the challenge of massive datasets by developing "algorithm weakening" frameworks, in which a data analysis procedure backs off to simpler methods as the data scale in size. Such a framework leverages the growing inferential strength of the data to ensure that a desired level of statistical accuracy is achieved with reduced runtime. The approach is concretely illustrated across a range of statistical estimation tasks, with convex relaxation techniques playing a prominent role as an algorithm weakening mechanism. In seeking a precise characterization of the computational and statistical tradeoffs obtained via convex relaxation, the investigator formalizes and studies new measures for characterizing the quality of approximation of one convex set by another. An interesting feature of this research is that convex relaxations which perform poorly in combinatorial optimization problems may nonetheless yield useful solutions when employed in problems with inferential objectives.
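
The idea of weakening an estimator by loosening its convex constraint can be sketched in a toy denoising problem. This is only an illustration under stated assumptions, not the investigator's method: the unknown signal is assumed to be a signed standard basis vector, it is observed n times in Gaussian noise, and the estimator projects the sample mean onto a convex set that contains the signal. The l1 ball (the convex hull of the candidate signals) plays the role of the tight constraint, and the larger l2 ball plays the role of a cheaper, weaker relaxation. All function names (`project_l1_ball`, `project_l2_ball`, `denoise`, `empirical_risk`) are hypothetical.

```python
import numpy as np


def project_l1_ball(v, radius=1.0):
    """Euclidean projection onto {x : ||x||_1 <= radius} (sort-based)."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)    # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


def project_l2_ball(v, radius=1.0):
    """Euclidean projection onto {x : ||x||_2 <= radius} (one rescaling)."""
    norm = np.linalg.norm(v)
    return v.copy() if norm <= radius else v * (radius / norm)


def denoise(samples, project):
    """Average the observations, then project onto the chosen convex set."""
    return project(samples.mean(axis=0))


def empirical_risk(project, n, x_star, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of the squared error for n observations."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(trials):
        noise = rng.standard_normal((n, x_star.size))
        samples = x_star + sigma * noise
        errs.append(np.sum((denoise(samples, project) - x_star) ** 2))
    return float(np.mean(errs))


if __name__ == "__main__":
    x_star = np.zeros(200)
    x_star[0] = 1.0                              # assumed 1-sparse signal
    for n in (10, 1000):
        tight = empirical_risk(project_l1_ball, n, x_star)
        weak = empirical_risk(project_l2_ball, n, x_star)
        print(f"n={n}: l1-ball risk={tight:.4f}, l2-ball risk={weak:.4f}")
```

Because the true signal lies in both sets, neither projection can move the sample mean farther from it. The l2 projection needs only a norm computation, whereas the l1 projection requires a sort, so a practitioner with many observations might try the weaker set first and check whether the target accuracy is already met.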

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1350590
Program Officer: Phillip Regalia
Project Start:
Project End:
Budget Start: 2014-02-01
Budget End: 2020-01-31
Support Year:
Fiscal Year: 2013
Total Cost: $475,000
Indirect Cost:
Name: California Institute of Technology
Department:
Type:
DUNS #:
City: Pasadena
State: CA
Country: United States
Zip Code: 91125