Sampling from a given distribution over a space with many attributes is a fundamental problem in computer science. Over the past two decades, practical applications of sampling have proliferated in areas such as statistics, networking, biology, differential privacy, and, most notably, machine learning. Sampling is used to evaluate models, as a subroutine for optimization, and more generally to explore large, complex spaces. In these practical settings, the time complexity of sampling is a severe limitation; known methods often require either restricting sampling to very small instances or resorting to unproven heuristics or overly restrictive assumptions. This project will develop a large-scale, high-dimensional toolkit for sampling smooth and non-smooth distributions, together with a suite of functions that can be computed or estimated using access to samples, and will evaluate the toolkit on real data sets. The toolkit will be developed in collaboration with domain experts in health metrics and systems biology.

The overall goal of the project is to produce general-purpose, open-source, publicly accessible software for sampling non-smooth log-concave distributions with millions of variables. Achieving this goal requires overcoming complex challenges in both theory and implementation. The new algorithms will be inspired by the investigators' expertise in convex optimization, high-dimensional geometry, and randomized linear algebra, especially their breakthroughs in linear programming and volume computation. In both target application domains, health metrics and systems biology, the investigators have worked with experts to develop and deploy the current state-of-the-art software tools. Drawing on this experience, they are poised both to develop general tools and to make data-driven discoveries in these domains.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1839323
Program Officer: A. Funda Ergun
Project Start:
Project End:
Budget Start: 2018-10-01
Budget End: 2021-09-30
Support Year:
Fiscal Year: 2018
Total Cost: $120,000
Indirect Cost:
Name: Georgia Tech Research Corporation
Department:
Type:
DUNS #:
City: Atlanta
State: GA
Country: United States
Zip Code: 30332