Computers increasingly make decisions that affect everyone, in settings as varied as setting credit card limits, flying aircraft, predicting the weather, and recommending products online. The algorithms that perform these tasks are complex, and most of the people who rely on them cannot fully appreciate their intricacies. The goal of this project is to equip these decision-making algorithms with measures of uncertainty so that, where appropriate, human or further computer intervention can be used to ensure fair and safe outcomes. The junior researchers involved in the project will also take part in outreach programs organized through Caltech. These programs are aimed at, and designed to reach, a diverse range of high school students; the outreach work will be enriched by the researchers' experience in equipping everyday algorithms with measures of uncertainty.

The purpose of this project is to obtain a deeper understanding of machine learning algorithms. This will be achieved by formulating and solving learning problems statistically, so that uncertainty in both the mathematical models used for learning and the data used to train them is tracked and quantified. The objectives are twofold: (i) to improve existing algorithms by making them cognizant of their own uncertainties, or by allowing humans to interact with them in an informed fashion; (ii) to use knowledge of uncertainties to study the predictive power of the algorithms and to identify laws or rules implicitly encoded within them. A Bayesian formulation of a number of machine learning tasks will be adopted, with a focus on neural networks and on related issues arising in graph-based semi-supervised learning. Recent advances in Markov chain Monte Carlo (MCMC) samplers for high-dimensional problems will be deployed to study uncertainty empirically. Various parameter limits (including large data volume, data in high-dimensional spaces, and small data noise) will be used to develop mathematical theories that quantify uncertainty in the predictions made by machine learning algorithms.
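To make the idea of a Bayesian formulation with MCMC-based uncertainty concrete, here is a minimal sketch. It is not the project's method: it uses a plain random-walk Metropolis sampler rather than the high-dimensional samplers mentioned above, and the one-parameter logistic model, the Gaussian prior width, the toy data, and all function names are assumptions chosen purely for illustration.

```python
import math
import random


def log_posterior(w, xs, ys, prior_std=2.0):
    """Unnormalized log posterior for a 1-D logistic model p(y=1|x) = sigmoid(w*x).

    The Gaussian prior N(0, prior_std^2) on w is an illustrative assumption.
    """
    log_prior = -0.5 * (w / prior_std) ** 2
    log_lik = 0.0
    for x, y in zip(xs, ys):
        s = w * x if y == 1 else -w * x
        # log sigmoid(s); for very negative s use the approximation log sigmoid(s) ~ s
        log_lik += s if s < -30.0 else -math.log1p(math.exp(-s))
    return log_prior + log_lik


def metropolis(xs, ys, n_samples=5000, burn_in=1000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the posterior over w (a stand-in sampler)."""
    rng = random.Random(seed)
    w = 0.0
    lp = log_posterior(w, xs, ys)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = w + rng.gauss(0.0, step)
        lp_prop = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            w, lp = proposal, lp_prop
        if i >= burn_in:
            samples.append(w)
    return samples


def predictive_summary(samples, x):
    """Posterior mean and standard deviation of p(y=1|x) over the sampled weights."""
    probs = [1.0 / (1.0 + math.exp(-w * x)) for w in samples]
    mean = sum(probs) / len(probs)
    var = sum((p - mean) ** 2 for p in probs) / len(probs)
    return mean, math.sqrt(var)


# Hypothetical toy data: labels tend to be 1 for positive x, with two noisy points.
XS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
YS = [0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    draws = metropolis(XS, YS)
    for x_query in (0.5, 1.0, 3.0):
        mean, std = predictive_summary(draws, x_query)
        print(f"x={x_query}: p(y=1) ~ {mean:.2f} +/- {std:.2f}")
```

The standard deviation returned by `predictive_summary` is the kind of uncertainty measure a downstream rule could use, for example by deferring to a human whenever it exceeds a chosen threshold.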

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1818977
Program Officer: Leland Jameson
Project Start:
Project End:
Budget Start: 2018-06-15
Budget End: 2021-05-31
Support Year:
Fiscal Year: 2018
Total Cost: $250,000
Indirect Cost:
Name: California Institute of Technology
Department:
Type:
DUNS #:
City: Pasadena
State: CA
Country: United States
Zip Code: 91125