As a growing volume of data, such as text, audio, and video, is collected from many sources, the need to better understand that data is growing as well. The analysis of large datasets has evolved into complex artificial intelligence (AI) techniques in recent years, because the need has expanded from having a human analyze the data to enabling a machine to make sense of it on its own. The overarching goal of this project is to make such AI applications, specifically deep learning, more efficient.

Graphics Processing Units (GPUs) are often used in AI applications like those mentioned above. This project addresses the fundamental limitations in resource management in modern GPUs. To this end, this project plans to take a holistic approach with three broad focus areas: (1) fine-grained sharing of individual GPUs; (2) coarse-grained sharing of a GPU cluster; and (3) dynamic readjustments to mitigate the impact of communication on distributed deep learning. The core techniques include temporal scheduling and spatial resource allocation with partial or no knowledge of job durations or workload characteristics. Algorithms designed as part of this project will have applications beyond simply running AI applications on GPU clusters.

Increasing GPU efficiency will help reduce the cost of using AI, leading to pervasive use of deep learning techniques. This will enable new applications of AI in emerging domains such as augmented/virtual reality and real-time interactive video analytics, while making them more cost-effective. The project includes plans to work with industry to translate the research into practice and to include its outcomes in graduate/undergraduate curricula. Lastly, it will build upon already-established outreach activities at the University of Michigan to help better convey the impact of AI on society to diverse student population groups and the general public.

All code and data generated and collected for this project, including software systems, simulators, and emulators, will be made available to the public as open-source resources at https://github.com/symbioticlab. They will be retained for at least the duration of the project.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1909067
Program Officer: Matt Mutka
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2022-09-30
Support Year:
Fiscal Year: 2019
Total Cost: $462,704
Indirect Cost:
Name: Regents of the University of Michigan - Ann Arbor
Department:
Type:
DUNS #:
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109