Contemporary machine learning techniques tend to be resource-intensive, often requiring high-quality datasets, expensive hardware, or significant computing power. In a wide array of application domains, ranging from healthcare to mobile computing, these critical resources are lacking. Novel methodologies that enable the optimal utilization of resources can help unlock the full potential of the data science revolution for these domains. Towards this aim, this project will develop theoretically grounded algorithms to facilitate the design of machine learning models under application-specific resource constraints. The outcomes of the project will help enable machine learning methods to operate with less human-annotated data, less computing power, and on a wider range of hardware platforms. To demonstrate interdisciplinary impact, the resulting algorithms will be employed in the design of efficient hydrological models that aid in predicting and managing water resources. The research will also be strongly coupled with education through the mentoring of undergraduate students, the development of new undergraduate and graduate courses, and live broadcasts of the lectures over publicly accessible online platforms.

This project aims to develop the foundational theories and algorithms to guide the efficient use of statistical and computational resources. The research on the statistical front focuses on the data and will uncover the fundamental tradeoffs between data amount, label quality, and model accuracy. Understanding these tradeoffs will lead to the design of improved loss functions and regularization techniques. On the computational front, theory-inspired model compression schemes will be developed by exploring the interplay between model size and accuracy. In addition, model performance will be enhanced by identifying the optimal model architecture via computationally efficient algorithms that co-design the architecture, the compression scheme, and the loss function. These theoretical and algorithmic investigations will utilize tools from statistical learning, optimization, deep learning theory, and high-dimensional probability. The proposed research is expected to provide a much-needed theoretical basis for poorly understood heuristics in fields spanning semi-supervised learning, model compression, and neural architecture search, and will guide the design of next-generation algorithms that achieve the optimal resource tradeoffs.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 2046816
Program Officer: Scott Acton
Project Start:
Project End:
Budget Start: 2021-02-01
Budget End: 2026-01-31
Support Year:
Fiscal Year: 2020
Total Cost: $219,515
Indirect Cost:

Name: University of California Riverside
Department:
Type:
DUNS #:
City: Riverside
State: CA
Country: United States
Zip Code: 92521