Machine learning has been instrumental in the recent advances in AI and big data analysis, and it is now used in almost every area of computer science and many fields of the natural sciences, engineering, and the social sciences. The central task of machine learning is to "train" a model, which entails seeking a model that minimizes a performance metric over a set of training examples. Such performance metrics are termed aggregate losses, to be distinguished from individual losses, which measure the quality of the model on a single training example. As the link between the training data and the model to be learned, the aggregate loss is a fundamental component of machine learning algorithms, and its theoretical and practical significance warrants a comprehensive and systematic study. The proposed work focuses on several fundamental research questions concerning the aggregate loss: Are there types of aggregate loss beyond the average of the individual losses? If so, what is a general abstract formulation of these new aggregate losses? How can the new aggregate losses be adapted to different machine learning problems? And what are the statistical and computational behaviors of machine learning algorithms that use the general aggregate losses?
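
To make the distinction between aggregate and individual losses concrete, the following is a minimal illustration; the notation is introduced here for exposition and is not taken from the award text. Given individual losses $\ell_i = \ell(f(x_i), y_i)$ computed on $n$ training examples for a model $f$, the standard aggregate loss is the average,
$$L_{\mathrm{avg}}(f) = \frac{1}{n}\sum_{i=1}^{n} \ell_i,$$
while one commonly cited rank-based alternative is the average of the $k$ largest individual losses,
$$L_{\mathrm{top}\text{-}k}(f) = \frac{1}{k}\sum_{j=1}^{k} \ell_{[j]}, \qquad \ell_{[1]} \ge \ell_{[2]} \ge \cdots \ge \ell_{[n]},$$
where $\ell_{[j]}$ denotes the $j$-th largest individual loss. The choice of $k$ interpolates between the maximum loss ($k=1$) and the average loss ($k=n$).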

The technical aims of the project are divided into four interrelated thrusts. The first thrust explores new types of rank-based aggregate losses for binary classification and studies efficient algorithms for optimizing the learning objectives formed from them. The new aggregate losses will be applied to problems such as object detection, where rank-based evaluation metrics are dominant. The second thrust aims to deepen the understanding of binary classification algorithms developed with the rank-based aggregate losses and focuses on their statistical theory, such as generalization and consistency. The third thrust extends the study of new types of aggregate losses to other supervised problems (multi-class and multi-label learning and supervised metric learning) and to unsupervised learning. The fourth thrust is dedicated to the theoretical aspects of aggregate losses, in which an aggregate loss is abstracted as a set function that maps the ensemble of individual losses to a number. This abstraction will be exploited to study the properties that make the new aggregate losses superior to the average loss and to propose new aggregate losses beyond rank-based ones.
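
As a sketch of the set-function abstraction mentioned in the fourth thrust (the symbols below are illustrative and not taken from the award text), an aggregate loss can be viewed as a map $\mathcal{A}: \mathbb{R}_{\ge 0}^{n} \to \mathbb{R}$ from the vector of individual losses $(\ell_1, \ldots, \ell_n)$ to a single number, with familiar choices appearing as special cases:
$$\mathcal{A}_{\mathrm{avg}}(\ell) = \frac{1}{n}\sum_{i=1}^{n}\ell_i, \qquad \mathcal{A}_{\max}(\ell) = \max_{1 \le i \le n}\ell_i, \qquad \mathcal{A}_{\mathrm{top}\text{-}k}(\ell) = \frac{1}{k}\sum_{j=1}^{k}\ell_{[j]},$$
where $\ell_{[j]}$ is the $j$-th largest individual loss. Structural properties such as monotonicity in each $\ell_i$ and invariance to permutations of the training examples are the kinds of conditions this abstraction makes it possible to study.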

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start:
Project End:
Budget Start: 2020-10-01
Budget End: 2023-09-30
Support Year:
Fiscal Year: 2021
Total Cost: $449,985
Indirect Cost:
Name: Suny at Buffalo
Department:
Type:
DUNS #:
City: Buffalo
State: NY
Country: United States
Zip Code: 14228