With the growth of the Web and improvements in scientific data collection technology, datasets have been rapidly increasing in size and complexity, necessitating comparable scaling of machine learning algorithms. However, designing and implementing efficient parallel machine learning algorithms is challenging and time consuming. To address this challenge, we recently released GraphLab, a framework providing an expressive and efficient high-level abstraction that satisfies the needs of a broad range of machine learning algorithms. The performance of our system has attracted significant attention, with thousands of downloads by universities and companies.
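To illustrate the style of abstraction described above, the sketch below expresses a computation (a simplified PageRank) as an update function applied to a vertex and its neighborhood, with a sequential loop standing in for a parallel scheduler. This is a minimal illustration of the programming model, not GraphLab's actual API; the names `Graph`, `pagerank_update`, and `run` are hypothetical.

```python
# Hypothetical sketch of a vertex-update programming model.
# Names and signatures are illustrative, not GraphLab's real API.

class Graph:
    def __init__(self, edges, num_vertices):
        self.num_vertices = num_vertices
        self.in_neighbors = {v: [] for v in range(num_vertices)}
        self.out_degree = {v: 0 for v in range(num_vertices)}
        for src, dst in edges:
            self.in_neighbors[dst].append(src)
            self.out_degree[src] += 1
        # Per-vertex data: start with a uniform rank.
        self.data = {v: 1.0 / num_vertices for v in range(num_vertices)}

def pagerank_update(graph, v, damping=0.85):
    """Recompute v's rank from its in-neighbors' current ranks."""
    total = sum(graph.data[u] / graph.out_degree[u]
                for u in graph.in_neighbors[v])
    graph.data[v] = (1 - damping) / graph.num_vertices + damping * total

def run(graph, update, iterations=50):
    # Stand-in for the runtime's scheduler: sweep all vertices repeatedly.
    for _ in range(iterations):
        for v in range(graph.num_vertices):
            update(graph, v)

g = Graph(edges=[(0, 1), (1, 2), (2, 0), (0, 2)], num_vertices=3)
run(g, pagerank_update)
print({v: round(r, 3) for v, r in g.data.items()})
```

The user supplies only the per-vertex update; the framework decides when and where each update runs, which is what makes parallel and distributed execution possible without changing the algorithm's code.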
Currently, GraphLab addresses only batch processing in multicore settings. In this project, we are developing GraphLab 2, which targets the much more challenging online and distributed settings, tackling: (1) cloud-based distributed machine learning; (2) natural graphs, whose very high-degree vertices are not amenable to standard graph-partitioning methods; (3) online tasks, where data and queries stream in over time; and (4) out-of-core computation, since huge problems may not fit in memory, even across the cloud.
A key contribution of the project is the continued dissemination and transfer of our technology. Our open-source software releases will continue to enable large-scale machine learning applications in science and engineering.
Our ambitious broader impact goals, beyond theory and systems, include the development of a new curriculum focused on preparing students for the industrial and scientific needs in this field. Our proposed courses include "Machine Learning on the Web" and "Cloud Computing for Big Machine Learning and Data Mining."