Data analysis is ubiquitous in a broad range of application fields, from computer graphics to geographic information systems, from sensor networks to social networks, from economics to medicine. It represents a fundamental problem in computational science. The project will advance the theoretical understanding of fundamental issues behind data analysis, and develop practical algorithms that will be useful for a broad range of problems in science and engineering.

The project addresses the fundamental problem of reconstructing the structure of probability distributions from sampled data. It will investigate the use of tensor-based and other higher-order methods, in particular those that allow for efficient optimization. The project lies at the interface of theoretical computer science, machine learning, signal processing and statistics, and will have potential impact in all of these fields. In recent years there has been a resurgence of interest in tensor methods in data analysis and inference, particularly in theoretical computer science. These methods will prove useful in a variety of applications in machine learning, signal processing and other fields.

The project will develop algorithms for solving a range of problems including blind source separation, spectral clustering, inference in mixture models and estimating the geometry of distributions. It will analyze the complexity of these and related problems. In particular, it will strive to understand computational efficiency and its dependence on the dimension of the space, studying "the curses and blessings of dimensionality". It will also address a somewhat mysterious discrepancy between sample complexity and algorithmic complexity in our understanding of many high-dimensional inference problems.

The results of this work will be disseminated to the broad scientific community through journal and conference publications and through presentations at various venues, including tutorials. The project also aims to implement the practical algorithms and to make the software available online. The research results will be incorporated into the curriculum of graduate classes taught by the PI and the co-PI. Graduate students supported by this project will receive extensive training in theory, algorithm development and applications.

Budget Start: 2014-08-01
Budget End: 2018-01-31
Fiscal Year: 2014
Total Cost: $450,000
Name: Ohio State University
City: Columbus
State: OH
Country: United States
Zip Code: 43210