Much of today's computational capability is housed in massive cloud computing infrastructures sometimes known as "Warehouse Scale Computers." This transition has given rise to a new class of applications that run on hundreds of thousands of powerful cores and access petabytes or exabytes of storage; these applications include web search, media streaming, and big data analysis. This centralization of the world's computing means that inefficiencies in those systems are greatly magnified. Conversely, if we can improve the efficiency of those systems, we measurably improve the efficiency (higher performance, lower energy consumption) of the world's computing infrastructure. This research addresses several sources of inefficiency, including increasingly inaccurate assumptions of hardware homogeneity, unpredictable interference between applications, and poor models of low-level resource sharing.
This research addresses these inefficiencies by (1) creating a heterogeneity-aware execution framework for cloud platforms that not only accounts for the heterogeneity of hardware capabilities, which is expected to increase over time, but also intentionally employs heterogeneity (at multiple levels) to run diverse workloads more efficiently; (2) creating a holistic runtime system for shared resource management that accounts for resource sharing at all levels, including low-level sharing on chip multiprocessors (CMPs) and multithreaded cores, allowing threads to be co-scheduled more aggressively; and (3) creating new, precise prediction models for performance and quality of service that can drive more intelligent scheduling decisions.