The proposed research largely lies in sparse optimization, a distinct new area of optimization research concerned with discovering sparse or otherwise simply structured solutions from dense datasets. Its development draws algorithmic techniques from classical nonlinear programming and is nurtured by developments in many other areas of data science. Today, the size, complexity, and diversity of problem instances have grown significantly. The proposed research addresses these new challenges in the following directions: data and variable splitting for handling multiple regularizers and for parallel and distributed optimization; efficient model path computation and regularization parameter selection; stochastic approximation; and coordinate descent methods for non-convex optimization. These investigations are expected to significantly reduce the running times of existing algorithms and to give rise to novel algorithms that enable the solution of a wide range of data science problems that are currently not solvable. In particular, the expected results will make it possible to fit machine learning models to previously inaccessible data (e.g., distributed data), to mine data in much higher dimensions and across different modalities, and to handle multiple regularizers in a computationally tractable way.

Technological advances in data gathering have led to a rapid proliferation of big data in diverse areas such as the Internet, engineering, climate studies, cosmology, and medicine. To make sense of this massive amount of data, new computational approaches are being introduced that let scientists and engineers analyze it. Among these approaches, sparse optimization and structured solutions have become enormously important, and their scope is quickly expanding. Beyond the sensing and processing of 1D signals and 2D images, high-dimensional quantities such as 3D video, 4D CT, and multi-way tensors have become the data or unknown variables in models. Beyond sparsity, structures such as low rank, sparse graphs, tree structure, linear representation by a few dictionary atoms, and combinations of these have emerged as desired structures in applications including genome mapping, protein structure study, social network analysis, stock price prediction, and text/speech mining. The proposed research will build on these recent successes and lead to new techniques for handling large and diverse data and variables, novel algorithms for pursuing a variety of structures in solutions, extensions of existing numerical methods to parallel and decentralized computing architectures, and contributions to solving key problems in several of the aforementioned application areas.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1317602
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2016-08-31
Support Year:
Fiscal Year: 2013
Total Cost: $299,999
Indirect Cost:
Name: University of California Los Angeles
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095