Data is being generated at a tremendous rate in diverse applications, such as health care, genomics, energy management and social network analysis. Indeed, the recent moniker of Big Data emphasizes that massive volumes of data are now ubiquitous. There is thus a great need for scalable and sophisticated methods for analyzing these data sets. This project addresses one aspect of this challenge: developing scalable, state-of-the-art numerical methods for modern problems that arise in machine learning.
The project will develop divide-and-conquer methods for representative, concrete problems that arise in contemporary applications. These include (a) classification: kernel support vector machines, (b) regression: kernel regression and high-dimensional sparse approximation, (c) structure learning: graphical model estimation, (d) spectral approximation: multi-scale SVD computation, and (e) missing value estimation: matrix factorization. The project will develop specialized algorithms for each of these problems, in particular tailored ways of dividing the problem into subproblems, solving the subproblems, and finally combining the subproblem solutions. In doing so, general principles for applying the divide-and-conquer approach to other problems in large-scale machine learning will be uncovered. The project will produce software for large-scale data analysis that is efficient on modern multi-core computers. The impact of the new algorithms on various application areas, such as bioinformatics and network analysis, will be studied. Within computer science and applied mathematics, the project will have a broad impact on research in a variety of disciplines, including numerical analysis, numerical optimization, statistics, machine learning, data mining and parallel computing.
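To make the divide-solve-combine pattern concrete, the following is a minimal sketch of one well-known instance, divide-and-conquer kernel ridge regression: the data are randomly partitioned, a local kernel regressor is solved on each part, and the local predictors are averaged. This is an illustration of the general strategy, not the project's actual algorithms; all function names and parameter values here are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve one subproblem: (K + lam*I) alpha = y on a data subset."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return X, alpha

def dc_krr(X, y, n_parts=4, lam=1e-2, gamma=1.0, seed=0):
    """Divide: random partition; solve: local KRR; combine: average predictors."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    models = [krr_fit(X[p], y[p], lam, gamma)
              for p in np.array_split(idx, n_parts)]
    def predict(Xnew):
        preds = [rbf_kernel(Xnew, Xi, gamma) @ a for Xi, a in models]
        return np.mean(preds, axis=0)
    return predict

# Toy usage: recover a smooth function from 400 samples split 4 ways.
X = np.linspace(0.0, 1.0, 400)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
predict = dc_krr(X, y, n_parts=4, lam=1e-3, gamma=50.0)
err = np.abs(predict(X) - y).mean()
```

Because each subproblem involves only n/m points, the cubic cost of the kernel solve drops from O(n^3) to m·O((n/m)^3), and the m solves are independent, so they map naturally onto multi-core hardware.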