Accurate predictions are key to effective decision making under uncertainty. Psychology research has shown that predictive judgments can be improved by taking the outside view: placing a problem in the context of similar historical cases rather than focusing on its unique features. But choosing the right comparison is difficult: statisticians have studied the so-called reference class problem since at least the 19th century. The main objective of this project is to assess the performance of a new method for crowdsourcing reference-class judgments and producing probability forecasts, relative to new and established machine learning models. The method, called human forests, promotes outside-view thinking by enabling forecasters to construct reference classes from a database of historical cases. The human forests method shares a conceptual connection with random forest machine learning models. In both, predictions are based on frequencies assessed in classification trees. While random forest models use training data to build the trees, human forests rely on forecasters' collective knowledge. The project will examine the relative strengths of both methods and explore combinations of the two. We will also assess methods for improving the accuracy of individual forecasters.
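The shared idea can be made concrete: both methods estimate a probability as the observed success rate within a reference class of similar historical cases, and a "forest" averages those base rates across several alternative classifications. The following is a minimal sketch of that frequency logic only; the case data, feature names, and helper functions are illustrative assumptions, not the project's actual method or data.

```python
# Hypothetical historical cases: (features, outcome), where outcome is
# 1 if the trial program advanced and 0 otherwise. Purely illustrative.
cases = [
    ({"phase": 2, "area": "oncology"}, 1),
    ({"phase": 2, "area": "oncology"}, 0),
    ({"phase": 3, "area": "oncology"}, 1),
    ({"phase": 2, "area": "cardiology"}, 0),
]

def reference_class_probability(cases, query, features):
    """Base rate of success among cases matching the query on the
    chosen features -- i.e., within one reference class ('tree leaf')."""
    matches = [outcome for feats, outcome in cases
               if all(feats.get(k) == query.get(k) for k in features)]
    if not matches:
        return None  # empty reference class: no estimate
    return sum(matches) / len(matches)

def forest_probability(cases, query, feature_sets):
    """Average the base rates from several reference classes ('trees'),
    ignoring classes with no matching cases."""
    estimates = [p for fs in feature_sets
                 if (p := reference_class_probability(cases, query, fs)) is not None]
    return sum(estimates) / len(estimates)

query = {"phase": 2, "area": "oncology"}
p = forest_probability(cases, query, [["phase"], ["area"], ["phase", "area"]])
```

In a random forest the feature sets that define each classification are learned from training data; in a human forest, forecasters supply them from domain knowledge.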

The intellectual merit of the proposal resides in its promise to address the reference class problem through collective intelligence. The project will compare the accuracy of human forests, complemented with metacognitive training and statistical aggregation techniques, with that of random forest models and a human-machine hybrid approach. The latter will use bi-level optimization, advancing the use of optimization in machine learning, with the aim of pushing the frontier of both machine and human capabilities. The core randomized experiments will focus on clinical trial forecasting, namely, predicting the probability of advancement for cancer treatments. The study will use naturalistic, longitudinal, large-scale online experiments and will compare the performance of subject-matter experts and generalists. The project will also provide training for researchers and students in machine learning and collective intelligence, and develop materials for interactive exercises in high-school STEM classes and in undergraduate and graduate courses in statistics and decision making. Assessing the relative importance of general forecasting skill versus subject-matter expertise may help address skill scarcity in areas that depend exclusively on specialists. The research aims to improve the predictability of clinical trial outcomes and similarly complex activities. Accurate forecasts of the success of clinical trial programs may in turn improve risk management and resource allocation, and ultimately result in wider availability of life-saving treatments.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant
Application #: 2050727
Program Officer: Claudia Gonzalez-Vallejo
Budget Start: 2020-08-01
Budget End: 2022-02-28
Fiscal Year: 2020
Total Cost: $291,221
Name: American University
City: Washington
State: DC
Country: United States
Zip Code: 20016