With the decline of Moore's Law, computer systems increasingly rely on specialized hardware, or accelerators to improve performance, scalability and energy efficiency. Data centers are projected to consume between 9% and 20% of the world's total power by 2025, so improved performance and efficiency are important to the health of our planet. Field Programmable Gate Arrays, or FPGAs, are reconfigurable hardware with flexibility to become any specialized hardware on demand. They are compelling in modern data centers because they enable a wide range of accelerators to be synthesized from a single reconfigurable device; providers such as Amazon and Microsoft now support FPGA acceleration in the cloud. However, the paradigm shift to reconfigurable hardware is massive for traditional operating systems, which are designed to manage fixed-function devices. Current systems assume a single application has exclusive access to the hardware, and the lack of sharing support limits utilization and efficiency.

The goal of this research is to develop system-level abstractions and end-to-end operating system (OS) support for protected FPGA sharing that improves utilization and throughput by an order of magnitude or more. This is challenging because FPGA accelerators are ephemeral and malleable: they may change frequently and be able to scale with increased shares of FPGA resource. These properties run counter to decades-old OS design assumptions. The project aims for a complete open-source stack that supports efficient high-utilization protected sharing for server processes to offload computations to FPGAs, and will extend it to function as a compatibility layer for cloud FPGA platforms. Measurements of an initial prototype called AmorphOS show that the approach has promise and can improve utilization and throughput by 8X and 32X for Microsoft Catapult and Amazon F1 systems running neural network (DNN) inference. The system has the possibility to be used pervasively in data centers supporting FPGAs, and the results of the work could guide the design of the next generation of operating systems for heterogeneous and reconfigurable hardware.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Application #
1846169
Program Officer
Marilyn McClure
Project Start
Project End
Budget Start
2019-06-01
Budget End
2024-05-31
Support Year
Fiscal Year
2018
Total Cost
$323,865
Indirect Cost
Name
University of Texas Austin
Department
Type
DUNS #
City
Austin
State
TX
Country
United States
Zip Code
78759