This project develops mathematical and computational approaches for exploiting big data. It develops fast online algorithms that learn and adapt as data arrives and changes, and it addresses how to automatically understand and reduce redundancy in the data for a given task. Big data also comes in multiple forms, e.g., audio and video, audio and text, video and weather, video from multiple sources, brain imaging from multiple modalities, or friendship networks and individual preferences, and the project addresses this multi-modal setting as well. The broad impact of the research stems from the large and diverse applicability of big data and of the techniques developed here. In education, the Internet classes developed under the project have an audience of tens of thousands, and the project provides a unique integration of research and undergraduate education via different Duke initiatives.

The framework follows the parsimony theory of sparse modeling. Challenges are addressed with a game-changing paradigm, learning to optimize: learning online what the task-dependent optimizer is expected to do, and developing computationally efficient algorithms that approximate the ideal behavior of sometimes unknown optimizers. The work derives novel multi-modal formulations for network inference and for real-time online robust principal component analysis (PCA) and robust non-negative matrix factorization (NMF), which are fundamental tools in big data modeling and exploitation, as well as for robust matching of 3D shapes, networks, and multi-modal data. The formulation solves bilevel optimization problems, which makes it efficient for classification and signal separation tasks. Sparse modeling is extended to new settings and algorithms, making such techniques usable for big data. The formulations and theoretical foundations are complemented with numerous applications.
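
As a rough illustration of the kind of sparse-modeling optimizer that a learning-to-optimize approach would approximate, the sketch below runs plain ISTA (iterative shrinkage-thresholding) on a tiny lasso problem. It is not the project's algorithm: the dictionary D, the signal x, the weight lam, the step size, and the function names are all hypothetical choices made for this example. A learned optimizer would typically replace the many fixed iterations shown here with a small number of steps whose parameters are trained from data.

```python
# Illustrative sketch only: ISTA for the lasso problem
#   min_z  0.5 * ||x - D z||^2 + lam * ||z||_1
# Standard-library Python; D, x, lam and step are hypothetical toy values.

def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(D, x, lam, step, n_iter):
    """Run n_iter ISTA iterations; D is a list of rows, x a list of floats."""
    n_rows, n_cols = len(D), len(D[0])
    z = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual r = D z - x
        r = [sum(D[i][j] * z[j] for j in range(n_cols)) - x[i]
             for i in range(n_rows)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n_rows))
             for j in range(n_cols)]
        # Gradient step followed by soft-thresholding
        z = [soft_threshold(z[j] - step * g[j], step * lam)
             for j in range(n_cols)]
    return z

# Toy dictionary with three unit-norm atoms in R^2 (stored as columns).
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
x = [1.0, 0.0]  # the signal equals the first atom

# step = 0.5 is 1/L, where L = 2 is the largest eigenvalue of D^T D here.
z = ista(D, x, lam=0.1, step=0.5, n_iter=200)
print(z)
```

On this toy problem the recovered code keeps only the first atom, with its coefficient shrunk slightly below 1 by the l1 penalty, which is the parsimony behavior the framework relies on.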

Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2017-08-31
Support Year:
Fiscal Year: 2013
Total Cost: $367,001
Indirect Cost:
Name: Duke University
Department:
Type:
DUNS #:
City: Durham
State: NC
Country: United States
Zip Code: 27705