Data centers provide the computational and storage infrastructure required to meet today's ever-increasing demand for Internet-based services. Web servers deliver a vast range of information on demand, ranging from static content such as files, images, video and audio streaming services, to dynamic content created via scripting languages (e.g., PHP) or stand-alone C/C++ applications (e.g., search results). Server performance, scaling and energy efficiency (throughput/Watt) are crucial factors in reducing total cost of ownership (TCO) in today's server-based industries. Unfortunately, current system designs based on commodity multicore processors may not be the most power/energy efficient for all server workloads.
Today's massively parallel accelerators (e.g., GPUs) provide exceptional performance per Watt for certain workloads versus conventional many-core CPUs. Unfortunately, these devices have not found widespread general-purpose use outside the high-performance computing domain. This project expands the use of massively parallel accelerators to server and operating system-intensive workloads by innovating across the application, runtime, operating system, and architecture layers.
This research builds on the observation that server workload execution patterns are often similar, rather than completely unique, across requests. The goal of this project is to develop computer systems (software and hardware) that exploit this similarity across requests to improve server performance and power/energy efficiency by launching data-parallel executions for request cohorts. The three primary aspects of this research are 1) mapping traditional thread/task-parallel workloads onto data-parallel hardware, 2) developing a new accelerator-centric operating system architecture, and 3) developing new architectural mechanisms to support this new class of accelerator workloads and operating system software.
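As a concrete illustration of the cohort idea, the sketch below batches requests that share a handler and services the entire batch with a single data-parallel kernel launch, one thread per request. This is a minimal, hypothetical CUDA example; the names (Request, Response, handle_cohort), the placeholder per-request computation, and the cohort size are assumptions made for illustration and do not reflect the project's actual software.

// Hypothetical sketch: batching similar web requests into a "cohort" and
// executing the cohort as one data-parallel kernel launch, one thread per
// request. All names and sizes are illustrative assumptions.
#include <cstdio>
#include <cuda_runtime.h>

#define COHORT_SIZE 256          // requests batched per kernel launch (assumed)

struct Request  { int user_id; int item_id; };
struct Response { int result;   };

// Every request in a cohort runs the same handler, so threads follow the
// same control path and the work maps naturally onto SIMT hardware.
__global__ void handle_cohort(const Request* reqs, Response* resps, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Placeholder request logic: the same computation applied per request.
        resps[i].result = reqs[i].user_id * 31 + reqs[i].item_id;
    }
}

int main()
{
    Request  h_reqs[COHORT_SIZE];
    Response h_resps[COHORT_SIZE];
    for (int i = 0; i < COHORT_SIZE; ++i)
        h_reqs[i] = { i, i % 7 };          // synthetic cohort of similar requests

    Request*  d_reqs;
    Response* d_resps;
    cudaMalloc(&d_reqs,  sizeof(h_reqs));
    cudaMalloc(&d_resps, sizeof(h_resps));
    cudaMemcpy(d_reqs, h_reqs, sizeof(h_reqs), cudaMemcpyHostToDevice);

    // One kernel launch services the whole cohort in data-parallel fashion.
    handle_cohort<<<(COHORT_SIZE + 127) / 128, 128>>>(d_reqs, d_resps, COHORT_SIZE);
    cudaMemcpy(h_resps, d_resps, sizeof(h_resps), cudaMemcpyDeviceToHost);

    printf("response[0] = %d\n", h_resps[0].result);
    cudaFree(d_reqs);
    cudaFree(d_resps);
    return 0;
}

The key design point the sketch highlights is that a request cohort replaces many independent CPU threads with a single wide kernel launch, which is where the throughput/Watt advantage of the accelerator is expected to come from.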