Learning from big data has been revolutionizing inference and decision-making, and yet several important applications fall in the small data regime. In this regime, obtaining training samples can be expensive, slow or even hazardous. Thus, there is a critical need for enabling sequential decision-making and inference as data from sequential samples is received over time. This project develops new methods to improve the efficiency and accuracy of sequential decision-making and inference. It will demonstrate the impact of expected outcomes via rigorous and targeted evaluation in applications such as content recommendation systems, clinical trials, distributed machine learning, and hyperparameter tuning. The research outcomes will be published to broad academic and professional audiences and incorporated into teaching curricula via graduate and undergraduate courses. The project will encourage a diverse group of students to participate in research. Through industry partnerships, outcomes of this research will be transitioned quickly to practice.

Multi-armed bandit algorithms, which aim to maximize the cumulative reward or identify the best option among a set of choices (referred to as arms), are naturally suited for problems involving sequential decision-making. However, most of the work on multi-armed bandit algorithms assumes independence of the rewards across arms. The objective of the proposed project is to exploit known latent structures and correlation between arms to drastically reduce the sample complexity of multi-armed bandit algorithms. In particular, the investigators aim to design sample-efficient algorithms for two different frameworks: i) the structured bandit framework, where the rewards depend on a common latent feature vector, and ii) a novel correlated bandit framework where reward realizations from arms are correlated with each other. In both frameworks, the project will result in the design of algorithms to maximize the cumulative reward (the exploration-exploitation problem) and to identify the best action/arm as fast as possible (the pure exploration problem). This will be done through a novel and easily generalizable approach to account for the available information on the structure or the correlation among arms to boost the performance of decision-making and inference.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start
Project End
Budget Start
2020-10-01
Budget End
2023-09-30
Support Year
Fiscal Year
2020
Total Cost
$500,000
Indirect Cost
Name
Carnegie-Mellon University
Department
Type
DUNS #
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213