RI:Small: Online Maximization Algorithms for Streaming Data

Ying, Yiming; Lyu, Siwei

Abstract

Although existing machine learning (ML) methods are effective for analyzing data that are static in nature, many of today's applications from information retrieval to web searching require real-time processing of massive amounts of streaming data. Often, the data is available in an online fashion, meaning the entire dataset is not available at once, but rather, individual data instances arrive sequentially. Area under the Receiver Operating Characteristic Curve (AUC) has been proven to be an effective evaluation metric and learning objective in many application domains, but it is difficult to optimize directly. To date, there are no satisfactory approaches to incorporate AUC into online ML algorithms for classification and ranking problems. This project will address the fundamental theoretical and algorithmic challenges in ML algorithms based on AUC maximization for processing streaming, high-dimensional data with efficient algorithms and theoretical analysis. The results of this project are expected to have broader impact in intrusion detection for cyber-security, fault detection in safety critical systems, information retrieval and cancer diagnosis. The planned research will also integrate with educational activities, including developing new undergraduate/graduate courses on optimization and machine learning, organizing a workshop and making software tools freely available to the public. The principal investigators also plan to undertake outreach activities to improve STEM learning of high school students.

A central topic of this project is to develop efficient online learning algorithms for AUC maximization and bipartite ranking, making them amenable for online processing of high dimensional and large volume of streaming data. This project also establishes a rigorous statistical foundation and thorough theoretical analysis of the online AUC maximization algorithms developed. The primary technical challenge in developing online AUC maximization algorithms and theory is that the objective functions involve statistically dependent pairs of instances while individual instances continuously arrive in a sequential order. This is in contrast to standard classification based on accuracy, where the loss functions only depend on individual instances and the related algorithms and theory are well developed. The project will address this gap by sufficiently exploiting the structures of the population objective involved in the problem of AUC maximization, rather than an empirical objective on finite data, through novel interaction of machine learning, statistical theory and applied mathematics. This will help to remove the pairwise structure in the original objective function and facilitate the development of efficient online learning algorithms for AUC maximization.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1816227
Program Officer: Rebecca Hwa

Project Start
Project End
Budget Start: 2018-08-15
Budget End: 2022-07-31
Support Year
Fiscal Year: 2018
Total Cost: $498,333
Indirect Cost

RI:Small: Online Maximization Algorithms for Streaming Data
Ying, Yiming Lyu, Siwei
Suny at Albany, Albany, NY, United States

Abstract

Funding Agency

Institution

Comments

Recent in Grantomics:

Recently viewed grants:

Recently added grants:

Abstract

Funding Agency

Institution

Comments