The goal of this project is to characterize several important problems in active learning from a theoretical perspective. Active learning is a kind of machine learning, which is a key aspect of Robust Intelligence. A central aim of machine learning is to develop techniques that construct models of data in order to make predictions in future situations. The past decades have seen huge advances in machine learning from labeled data. However, labels are often difficult to obtain. Active learning addresses situations in which the data are unlabeled and any labels must be explicitly requested and paid for. Its aim is to learn a good classifier with as few labels as possible. Despite its practical importance, active learning is a comparatively underdeveloped area of machine learning.
This project will rigorously investigate the potential of intelligent querying and develop practical, label-efficient learning algorithms. It will bring together a diversity of student talent, from theoreticians to domain experts in biology and vision applications. The resulting algorithms will be made widely available and have the potential to increase the applicability of machine learning to the many large-scale problems in which the difficulty of labeling is a critical bottleneck.