Datacenters are critical infrastructure for modern Internet services such as search, social networking and e-commerce. Rack-scale computers are emerging to fundamentally change how datacenters are designed, built and managed. They disaggregate the resources in each rack of servers into separate pools and organize them at the rack level. Such resource disaggregation can enable fine-grained resource allocation and increase resource utilization. Resource management is essential for rack-scale computers to fully realize these benefits. Yet densely packed resources and the rise of millisecond-scale and microsecond-scale tasks place unprecedented throughput and latency requirements on the resource manager. Today's server-based solutions fall short of meeting these requirements. This project investigates a new architecture that leverages the power and flexibility of new-generation programmable switches for resource management in rack-scale computers. In doing so, it explores the boundary of in-network computing. While networks are traditionally designed for packet forwarding, this project exploits the capability of new-generation programmable switches to realize application-level functionality that goes beyond traditional packet processing. It uses in-network resource management to exemplify how networks and systems can be deeply integrated and co-designed for next-generation rack-scale computers.

This project will not only improve resource management for rack-scale computers in practice, but also provide new architectural and theoretical insights into computer system design and principles. While the new architecture benefits directly from switch hardware for high performance, the core challenge is to realize generic resource management policies with limited switch functionality and resources. To address this challenge, this project will exploit compact data structures to efficiently store resource consumption and utilization information with minimal switch resources, and leverage randomized and approximation algorithms to design lightweight mechanisms that let the switch data plane make near-optimal resource management decisions. A prototype system will be implemented with commodity servers and switches and evaluated with microbenchmarks and end-to-end system experiments under a wide range of workloads. This project will provide extensive research, training and educational opportunities for both undergraduate and graduate students, and actively engage women and minorities through research projects and new course materials. This project will release open-source software so that other researchers can leverage and reproduce the results and industry can adopt the solutions in practice.
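As an illustration of the kind of lightweight randomized mechanism described above, the following is a minimal sketch of "power of two choices" placement over a fixed array of per-node load counters. The abstract does not say which algorithms or data structures the project uses, so this technique is swapped in purely as an example, and the names `TwoChoiceScheduler`, `assign`, `release` and `simulate` are hypothetical. It runs as ordinary Python on a host, not on a switch data plane.

```python
import random


class TwoChoiceScheduler:
    """Hypothetical sketch: place each task on the less loaded of two random nodes."""

    def __init__(self, num_nodes, seed=None):
        if num_nodes < 2:
            raise ValueError("need at least two nodes to sample a pair")
        # One small counter per node, standing in for a compact register array.
        self.load = [0] * num_nodes
        self.rng = random.Random(seed)

    def assign(self, cost=1):
        """Sample two distinct nodes and charge the task to the lighter one."""
        a, b = self.rng.sample(range(len(self.load)), 2)
        target = a if self.load[a] <= self.load[b] else b
        self.load[target] += cost
        return target

    def release(self, node, cost=1):
        """Return resources when a task finishes, never going below zero."""
        self.load[node] = max(0, self.load[node] - cost)


def simulate(num_nodes=16, num_tasks=1600, seed=0):
    """Assign unit-cost tasks and return the resulting per-node loads."""
    sched = TwoChoiceScheduler(num_nodes, seed)
    for _ in range(num_tasks):
        sched.assign()
    return sched.load


if __name__ == "__main__":
    loads = simulate()
    print("min load:", min(loads), "max load:", max(loads))
```

Each decision reads only two counters and updates one, which is why this style of mechanism is often considered a plausible fit for hardware with tight per-packet compute and memory budgets. Whether the project adopts it is not stated in the abstract.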

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1813487
Program Officer: Darleen Fisher
Project Start:
Project End:
Budget Start: 2018-10-01
Budget End: 2021-09-30
Support Year:
Fiscal Year: 2018
Total Cost: $499,998
Indirect Cost:
Name: Johns Hopkins University
Department:
Type:
DUNS #:
City: Baltimore
State: MD
Country: United States
Zip Code: 21218