In modern data-intensive science and engineering, researchers must often estimate models from far fewer observations than the dimension of the model to be estimated. The key to the success of compressed sensing, matrix completion, and other problems of this type is to properly exploit knowledge about the "structure" of the model. While individual structures such as sparsity have been studied separately, the problem of "simultaneous structures" has been neglected, because practitioners implicitly assume that simply combining known results for each structure would solve the joint problem. The PIs recently proved that this approach can result in a significant gap.

This proposal will develop theory and computationally tractable methods for estimating simultaneously structured models with minimal observations. It combines (1) a top-down approach to understand the fundamental limitations based on the geometry of how structures interact, and (2) a problem-specific, bottom-up approach to exploit domain knowledge in constructing appropriate penalties. This work addresses a variety of applications including (1) sparse principal component analysis, a central problem in statistics seeking approximate but sparse eigenvectors, (2) sparse phase retrieval and quadratic compressed sensing in signal processing, and (3) code design for communications and network coding.

The ability to systematically derive structured models from data will have far-reaching impact on engineering challenges in the era of Big Data and ubiquitous computing. Handling models with multiple structures poses deep theoretical and computational challenges, which are the focus of this proposal. Applications in machine learning, signal processing, and network coding are discussed. The PIs will incorporate research results into their teaching, organize technical workshops to bring together mathematicians and engineers, and involve undergraduate students in this work through summer research programs.

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1409204
Program Officer: Phillip Regalia
Project Start:
Project End:
Budget Start: 2014-08-15
Budget End: 2018-07-31
Support Year:
Fiscal Year: 2014
Total Cost: $500,000
Indirect Cost:
Name: California Institute of Technology
Department:
Type:
DUNS #:
City: Pasadena
State: CA
Country: United States
Zip Code: 91125