To handle large-scale problems, many algorithms have been proposed for speeding up the training of machine learning models. In many real-world applications, however, the bottleneck is the prediction phase rather than the training phase, because of the time and space complexity of prediction. Unlike training, which can run for several hours on multiple machines, prediction usually runs on real-time systems, so each prediction has to be completed within a few seconds to provide immediate feedback to users. Applications that run on mobile devices face even stricter constraints on memory capacity and computational resources. To address these issues, this research develops a new family of machine learning algorithms with faster prediction time and smaller model size. The outcome of this project creates a fundamental shift in the applicability of machine learning models to real-time online systems and on-device applications. Software packages and experimental platforms are made available to the public after being tested on applications. Besides the research objectives, the PI also pursues educational objectives, including promoting undergraduate research, involving under-represented minorities in science and engineering, and developing undergraduate and graduate data science curricula.

The goal of this project is to develop novel approaches for reducing the prediction time and model size of machine learning algorithms. In particular, the project focuses on machine learning applications with large output spaces (matrix factorization, extreme multi-class/multi-label classification) and on highly nonlinear models (kernel methods and deep neural networks). A series of approximation algorithms is studied, including tree-based algorithms, clustering approaches, and sub-linear time search algorithms. A unified framework is developed for these algorithms, and the trade-off between accuracy and prediction time/model size is studied both in theory and in practice. The proposed algorithms are evaluated on a broad range of real-world applications, including online web services and on-device applications.
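
As a rough illustration of the clustering approach to large output spaces mentioned above, the sketch below approximates top-k retrieval by inner product, the prediction step in a matrix factorization recommender. It is not the project's algorithm. All function names are hypothetical, and the centroids are assumed to be given (for example, a random sample of items or the output of k-means). Only the items in the `n_probe` highest-scoring clusters are scored, which trades some accuracy for less prediction time.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_clusters(items, centroids):
    """Assign each item index to the centroid with the largest inner product.

    Assumption: centroids are supplied by the caller; this sketch does not
    refine them.
    """
    clusters = [[] for _ in centroids]
    for idx, vec in enumerate(items):
        best = max(range(len(centroids)), key=lambda c: dot(vec, centroids[c]))
        clusters[best].append(idx)
    return clusters


def exact_top_k(query, items, k):
    """Baseline: score every item, linear in the number of items."""
    order = sorted(range(len(items)), key=lambda i: dot(query, items[i]), reverse=True)
    return order[:k]


def approx_top_k(query, items, centroids, clusters, k, n_probe):
    """Score only the items in the n_probe clusters whose centroids best match the query."""
    ranked = sorted(range(len(centroids)), key=lambda c: dot(query, centroids[c]), reverse=True)
    candidates = [i for c in ranked[:n_probe] for i in clusters[c]]
    candidates.sort(key=lambda i: dot(query, items[i]), reverse=True)
    return candidates[:k]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic item and query embeddings, for illustration only.
    items = [[random.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
    centroids = random.sample(items, 20)
    clusters = build_clusters(items, centroids)
    query = [random.gauss(0, 1) for _ in range(8)]
    exact = set(exact_top_k(query, items, 10))
    approx = set(approx_top_k(query, items, centroids, clusters, 10, n_probe=5))
    print("overlap with exact top-10:", len(exact & approx))
```

Raising `n_probe` moves the result toward the exact answer at the cost of scoring more items; with `n_probe` equal to the number of clusters, every item is scored and the result matches the exhaustive search.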

Project Start:
Project End:
Budget Start: 2018-08-13
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2019
Total Cost: $362,846
Indirect Cost:

Name: University of California Los Angeles
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095