Data centers and cluster computing platforms have become the dominant computational paradigm of the past decade and the de facto means of executing Big Data workloads. Typically, the entire cluster is treated as a set of resources shared by multiple clients, who submit jobs requiring different types of resources. Adding to this heterogeneity (or dimensionality) in resource requirements, the machines in a cluster can themselves be heterogeneous in the resources they provide. In such settings, resources need to be allocated and priced appropriately so as to balance performance with demand.

In this project, the PIs seek to study job scheduling in data centers to optimize temporal Quality of Service metrics, such as response time, together with fairness, when jobs have multi-dimensional resource requirements. The main question studied can be phrased as: How does the temporal nature of job scheduling interact with the dimensionality of resource requirements, and how do these two in turn interact with the classical economic desiderata of incentives and fairness?

This project differs from previous work in aiming to develop appropriate models and algorithms through the lens of theoretical computer science, particularly by fusing the disparate fields of approximation and online algorithms, algorithmic game theory, and stochastic optimization. The resulting insights will also be used to develop new techniques for classical scheduling and game-theoretic problems that have so far defied solution. The project is interdisciplinary: the theoretical models and techniques developed will be motivated by the application domain of new hardware architectures stemming from emerging technologies, and by the heterogeneity that arises from provisioning them within a data center. Further, empirical validation will be performed, both via simulation on traces from data center executions and via deployment and experiments on clusters. This will ultimately influence the design and deployment of internet systems that use and generate massive data.

The interdisciplinary nature of the project points not only to the need to train a pipeline of students, from high school through graduate level, and to impart to them the power of algorithmic thinking and its broader relevance, but also to the need to bring scientists, mathematicians, and system builders onto the same platform for an active exchange of ideas. Towards this end, the PIs seek to equip the next generation of students, including women and minorities, with the relevant algorithmic skills through an education plan that includes effective teaching and mentorship, and to disseminate the proposed work broadly by organizing workshops and by writing books and surveys.

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1409130
Program Officer: Tracy Kimbrel
Budget Start: 2014-08-01
Budget End: 2019-07-31
Fiscal Year: 2014
Total Cost: $397,007
Name: University of California - Merced
City: Merced
State: CA
Country: United States
Zip Code: 95343