Many critical workloads today, such as web-hosted services, are limited not by raw CPU processing power but by interactions between the CPU cores, the memory system, I/O devices such as disks and network interfaces, and the complex software (applications, middleware, operating systems, virtual machines) that ties all these components together. To improve the efficiency of these workloads and systems, designers and developers need tools to identify the bottlenecks, so that they can address them. However, existing performance analysis tools, such as software profilers, cannot account for hardware bottlenecks or for situations where software overheads are hidden due to overlap with other operations.
As computer systems become ever more complex networked aggregations of software and hardware from multiple vendors, isolating and addressing the inefficiencies that reduce throughput and waste energy becomes a daunting task. The goal of this project is to address the problem by developing an analysis methodology and tool set that identify true bottlenecks in complex systems spanning multiple software and hardware layers that execute concurrently across multiple CPU cores and dedicated hardware devices. The research uses critical-path analysis not only to identify bottlenecks but also to quantify their contribution and to estimate the speedup obtainable if a particular set of bottlenecks is removed or reduced. Ultimately, this research will lead to qualitative performance improvements in software and hardware system designs, as the methodology and tools it produces help designers and developers focus their efforts on removing the true bottlenecks.
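As a rough illustration of the critical-path idea, the following minimal sketch models a workload as a dependency graph of timed operations, finds the longest (critical) path, and estimates the speedup from removing a chosen set of operations. It is not the project's tool: the function names, the operation names, and the durations are all hypothetical, and treating "removal" as setting a duration to zero is a simplifying assumption. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def critical_path(durations, deps):
    """Return (length, path) of the longest path through a dependency DAG.

    durations: operation name -> time taken (hypothetical units)
    deps:      operation name -> list of operations it must wait for
    """
    finish = {}  # earliest finish time of each operation
    prev = {}    # predecessor that determines each operation's start time
    for node in TopologicalSorter(deps).static_order():
        # An operation starts when its slowest predecessor finishes.
        best = max(deps.get(node, ()), key=lambda p: finish[p], default=None)
        start = finish[best] if best is not None else 0.0
        finish[node] = start + durations[node]
        prev[node] = best
    # Walk back from the operation that finishes last.
    node = max(finish, key=finish.get)
    length = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return length, path[::-1]


def estimate_speedup(durations, deps, removed):
    """Estimate speedup if the operations in `removed` took zero time.

    The critical path is recomputed, because removing one bottleneck
    can expose a different path as the new limit.
    """
    baseline, _ = critical_path(durations, deps)
    reduced = {op: (0.0 if op in removed else t) for op, t in durations.items()}
    improved, _ = critical_path(reduced, deps)
    return baseline / improved


# Hypothetical request: a disk read and a computation overlap after parsing.
durations = {"parse": 2, "disk_read": 5, "compute": 3, "net_send": 4, "reply": 1}
deps = {
    "disk_read": ["parse"],
    "compute": ["parse"],
    "net_send": ["disk_read", "compute"],
    "reply": ["net_send"],
}

length, path = critical_path(durations, deps)
print(length, path)
print(estimate_speedup(durations, deps, {"disk_read"}))
print(estimate_speedup(durations, deps, {"compute"}))
```

In this toy graph the computation overlaps with the slower disk read, so removing it yields no speedup, while removing the disk read helps only until the computation becomes the new limit. This is the kind of hidden, overlapped cost that a flat profile of time per operation would not reveal.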
The project involves both graduate and undergraduate students as researchers. The results will be fed into appropriate courses, including one on parallel architectures. The material and slides for this course will be made available over the web.