CAREER: The optimal use of data

Duchi, John

Abstract

Modern techniques for data gathering (arising from medicine and bioinformatics, internet applications such as web search, physics and astronomy, and mobile data gathering platforms) have yielded an explosion in the mass and diversity of data. Concurrently, statistics, decision theory, and machine learning have successfully laid a groundwork for answering questions about our world based on analysis of this data. As more information is collected, classical approaches for inference and learning are insufficient, as additional concerns arise (computational resources, privacy considerations, storage limitations, network communication constraints) outside of statistical accuracy. This prompts a basic question: how can multiple criteria be balanced while maintaining statistical performance?

To bring statistics and machine learning into closer contact with other desiderata, this research involves the development of procedures that trade off scarce resources in principled and optimal ways. Such trade-offs have been difficult to characterize, as current tools for providing fundamental limits (such as information theory in communication) do not connect disparate areas. Three concrete sub-areas serve as bases for this research. First, the investigators study the interplay of computing with learning, estimation, and optimization by connecting notions of computation (such as memory accesses or synchronization in distributed systems) to data analysis tasks. Second, the research investigates adaptive and robust procedures, and their associated statistical costs, that will become more important given increasingly long-tailed and messy data. Third, the investigators study privacy in estimation, using information-theoretic and decision-theoretic tools to characterize the tensions between statistical accuracy and sensitive data disclosures. Combined, these lay the groundwork for a theory on the use of data in the face of constraints, along with a functional and practical understanding of procedures that balance scarce resources against statistical accuracy.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1553086
Program Officer: Phillip Regalia

Budget Start: 2016-02-15
Budget End: 2021-01-31
Fiscal Year: 2015
Total Cost: $497,033

Institution: Stanford University, Stanford, CA, United States