The rapid development of large-scale data-collection technology has ignited research into high-dimensional machine learning. For instance, the problem of designing recommender systems, such as those used by Amazon, Netflix and other online companies, involves analyzing large matrices that describe users' behavior in past situations. In sociology, researchers are interested in fitting networks to large-scale data sets involving hundreds or thousands of individuals. In medical imaging, the goal is to reconstruct complicated phenomena (e.g., brain images; videos of a beating heart) from a minimal number of incomplete and possibly corrupted measurements. Motivated by such applications, the goal of this research is to develop and analyze models and algorithms for extracting relevant structure from such high-dimensional data sets in a robust and scalable fashion.

The research leverages tools from convex optimization, signal processing, and robust statistics. It consists of three main thrusts: (1) Model restrictiveness: Successful methods for high-dimensional data exploit low-dimensional structure; however, many real-world problems fall outside the scope of existing models. This proposal significantly extends the basic set-up by allowing for multiple structures, leading to computationally efficient algorithms while eliminating negative effects of model mismatch. (2) Non-ideal data: Missing data are prevalent in real-world problems and can cause major breakdowns in standard algorithms for high-dimensional data. The second thrust devises relaxations and greedy approaches for the resulting non-convex problems. (3) Arbitrary outliers: Gross errors can arise for various reasons, including fault-prone sensors and manipulative agents. The third thrust proposes efficient, randomized algorithms to address arbitrary outliers.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1302435
Program Officer: Phillip Regalia
Project Start:
Project End:
Budget Start: 2013-07-01
Budget End: 2019-06-30
Support Year:
Fiscal Year: 2013
Total Cost: $695,369
Indirect Cost:
Name: University of Texas Austin
Department:
Type:
DUNS #:
City: Austin
State: TX
Country: United States
Zip Code: 78759