A crucial part of any data-driven decision-making problem under uncertainty is the ability to guarantee, with a high degree of confidence, that estimated decision rules perform well when actually deployed in practice. Providing such guarantees is precisely the key role of statistical inference. The goal of this project is to investigate a novel inference methodology that builds, from inception, data-driven decision-making rules with enhanced out-of-sample properties. This is achieved by introducing a game-theoretic formulation in which the decision maker optimizes against an adversary that optimally exploits potential weaknesses of a decision by adding perturbations to the data (of reasonable size, yet in arbitrary directions). A statistical framework is designed to estimate an optimal amount of data perturbation to obtain robust, yet practical, decision rules. The framework naturally leads to optimal (in a certain sense) confidence regions for the underlying decision-making parameters. The output of this project has implications in various areas of applied decision making under uncertainty; machine learning, artificial intelligence, operations management, and transportation are applications of special interest. The graduate student will work on computational methods for large-scale optimal transport.

The novel inference methodology to be investigated in this project unifies and extends a large class of estimators (such as generalized Lasso and regularized logistic regression, among many others) that have been successfully applied in practice. These are encompassed within a distributionally robust optimization (DRO) framework. A DRO formulation is a game in which the statistician chooses a parameter or an action to minimize a certain expected loss, and an adversary chooses a perturbation of the empirical measure against the statistician (maximizing the expected loss) within a certain size (called the distributional uncertainty size). In the context of this proposal, the size of this perturbation is measured in terms of optimal transport costs (or Wasserstein distances). The use of the Wasserstein distance and the DRO formulation motivates the name of this methodology, Robust Wasserstein Profile Inference. The proposal studies the optimal selection of the distributional uncertainty size and the associated optimal confidence regions induced by the DRO formulation. Specific applications, for example to shape-constrained estimation and stochastic optimization problems in engineering, will be studied by the PI. The proposed research provides a rich interplay between the theory of optimal transport, statistical inference, and convex optimization. Finally, the PI will attempt to recruit high-quality personnel from under-represented groups. The PI will also disseminate the scientific output of this proposal via open access sites, in addition to the standard vehicles (conferences and journal publications).
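
The game between the statistician and the adversary can be written compactly. What follows is only a sketch in generic notation: the symbols $\beta$, $\ell$, $P_n$, $\delta$, $c$, $D_c$, and $\Pi$ are assumptions chosen for illustration and do not appear in the abstract itself.

```latex
% beta  : parameter or action chosen by the statistician (assumed notation)
% ell   : loss function; X : a generic data point
% P_n   : empirical measure of the n observations
% delta : distributional uncertainty size
% c     : transport cost between data points
\[
  \min_{\beta} \; \sup_{P \,:\, D_c(P, P_n) \le \delta} \;
  \mathbb{E}_{P}\bigl[\ell(X; \beta)\bigr],
  \qquad
  D_c(P, Q) = \inf_{\pi \in \Pi(P, Q)} \mathbb{E}_{\pi}\bigl[c(U, V)\bigr].
\]
% Pi(P, Q) : set of couplings, i.e. joint laws of (U, V) with marginals P and Q
```

The inner supremum is the adversary's move and the outer minimum is the statistician's. If the cost is taken to be $c(u, v) = d(u, v)^p$ for a metric $d$ and $p \ge 1$, then $D_c$ is the $p$-th power of the order-$p$ Wasserstein distance, which is how the optimal transport cost and the Wasserstein distance are related in this setting.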

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1915967
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2019-07-01
Budget End: 2022-06-30
Support Year:
Fiscal Year: 2019
Total Cost: $166,667
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305