CIF:Small:Information-theoretic and Computational Thresholds in Statistical Learning

Montanari, Andrea

Abstract

Advanced algorithms are an increasingly powerful tool to extract information from vast amount of data that are gathered over the Internet, by smartphones, sensors networks, or high-throughput scientific studies. As these methods become ubiquitous, it is crucial to understand their full potential. What kind of information can we hope to extract from a certain type of data? Viceversa, how much data should we accumulate in order to be able to infer a certain piece of information? What is the bottleneck that prevents us from extracting more information? These questions have been studied within classical statistics, but modern applications pose entirely new challenges and classical concepts are only partially useful. In particular, computational resources become a crucial bottleneck for modern datasets. In many cases, although the data contain in principle the information of interest, finding it is a needle-in-haystack problem, and cannot be done on human timescales. This project aims at characterizing these fundamental limitations in several central problems, and develop algorithms that can achieve those limits.

Both information theory and complexity theory fall short of capturing the fundamental limitations to statistical learning tasks. This project follows a different approach which aims at analyzing broad classes of algorithms, and draw connections between their behavior. More precisely, the project considers three such classes that essentially encompass most algorithms used nowadays: empirical risk minimization; semidefinite programming hierarchies; and local algorithms. The focus is on two concrete statistical estimation problems that are relevant for a number of applications: group synchronization on graphs; non-linear high-dimensional regression and classification. In these and analogous problems, the behavior of seemingly different types of algorithms is often surprisingly similar. Understanding the origin of this similarity and its implications is a key focus of this research.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 1714305
Program Officer: Phillip Regalia

Project Start
Project End
Budget Start: 2017-07-01
Budget End: 2021-06-30
Support Year
Fiscal Year: 2017
Total Cost: $450,000
Indirect Cost

CIF:Small:Information-theoretic and Computational Thresholds in Statistical Learning
Montanari, Andrea
Stanford University, Stanford, CA, United States

Abstract

Funding Agency

Institution

Comments

Recent in Grantomics:

Recently viewed grants:

Recently added grants:

Abstract

Funding Agency

Institution

Comments