This project's goal is to acquire and develop an instrumented datacenter testbed spanning the three sites of the NSF Center for Autonomic Computing (CAC): the University of Florida (UF), the University of Arizona (UA), and Rutgers, the State University of New Jersey (RU). Datacenters are a growing component of society's IT infrastructure, supporting services related to health, banking, commerce, defense, education, and entertainment. Annual energy and administration costs of today's datacenters amount to billions of dollars; high energy consumption also translates into excessive heat dissipation, which in turn raises cooling costs and server failure rates. The proposed testbed will enable a fundamental understanding of the operation of datacenters and the autonomic control and management of their resources and services. The design of the underlying infrastructure reflects the natural heterogeneity, dynamism, and distribution of real-world datacenters, and includes embedded instrumentation at all levels, including the platform, virtualization, middleware, and application layers. Its scale and geographical distribution enable studies of challenges faced by datacenter applications, services, middleware, and architectures related to both "scale-up" (increases in the capacity of individual servers) and "scale-out" (increases in the number of servers in the system). This testbed will enable fundamental and far-reaching research focused on cross-layer autonomics for managing and optimizing large-scale datacenters. The participating sites will contribute complementary expertise: UA at the resource level, UF at the virtualization layer, and RU in the area of services and applications. The collaboration between the university sites will bring coherence to ongoing separate research efforts and have a transformative impact on the modeling, formulation, and solution of datacenter management problems, which have so far been considered mostly in terms of individual layers. The testbed will also provide critical infrastructure for education at multiple levels: it will give students hands-on experience via course projects, enable the development of new advanced multi-university and cross-disciplinary courses, and support multi-site group projects focused on end-to-end autonomics. Students from underrepresented groups will be actively involved in the research, and their participation will be increased through ongoing collaborations with minority institutions. Even broader community participation will result from an evolving partnership with recently proposed industry cloud initiatives.

Project Report

With the rapid growth of data centers and clouds, managing the power cost and power consumption of their computing and storage resources efficiently has become critically important. Several research studies have shown that data servers typically operate at a low utilization of 10% to 15%, while their power consumption is close to that at peak load. Given this significant fluctuation in workloads, elastic delivery of computing services with an efficient power provisioning mechanism becomes an important design goal. Live workload migration and virtualization are important techniques for optimizing power and performance in large-scale data centers. This project presents an application-specific, autonomic, adaptive power and performance management system that uses AppFlow-based reasoning to dynamically configure datacenter resources and workload allocations. The system continuously monitors the workload to determine the current operating point of both the workloads and the virtual machines (VMs) running them, and then predicts the next operating points for these VMs. This enables the system to allocate just enough hardware resources to run the VM workloads efficiently with minimum power consumption. We have experimented with and evaluated our approach by managing the VMs running the RUBiS bidding application. Our experimental results showed that our approach can reduce the VMs' power consumption by up to 84% compared to static resource allocation and by up to 30% compared to other methods, with minimal performance degradation. The growing scale of enterprise computing environments and data centers has made issues related to power consumption, air conditioning, and cooling infrastructure a critical concern because of the growing operating costs of both power and cooling. Furthermore, power and cooling costs are increasing by an alarming eightfold every year and are becoming the dominant part of IT budgets. Clearly, the impact of this project extends beyond science and technology to the operation of enterprise data centers.
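To make the monitor-predict-allocate cycle described above concrete, the Python sketch below shows one plausible shape of such a control loop under simple assumptions: workload utilization is sampled periodically, the next operating point is predicted with an exponentially weighted moving average, and the smallest vCPU allocation that keeps the predicted per-vCPU utilization at or below a target is chosen. The names (predict_next_load, choose_allocation) and the EWMA predictor are illustrative assumptions, not the project's actual AppFlow-based implementation.

# Hypothetical monitor-predict-allocate loop for VM power/performance
# management; a sketch under assumed names, not the AppFlow system itself.
import math

def predict_next_load(history, alpha=0.5):
    # Exponentially weighted moving average as a stand-in predictor for
    # the VM's next operating point (an assumption, not the project's model).
    estimate = history[0]
    for util in history[1:]:
        estimate = alpha * util + (1 - alpha) * estimate
    return estimate

def choose_allocation(predicted_util, current_vcpus, max_vcpus=8, target_util=0.7):
    # Convert predicted utilization into absolute CPU demand (in vCPU units),
    # then pick the smallest allocation that keeps per-vCPU utilization at or
    # below the target; reclaimed vCPUs represent capacity (and power) saved.
    demand = predicted_util * current_vcpus
    return max(1, min(max_vcpus, math.ceil(demand / target_util)))

def control_step(util_history, current_vcpus):
    predicted = predict_next_load(util_history)
    return choose_allocation(predicted, current_vcpus)

if __name__ == "__main__":
    # Simulated utilization trace for one VM, e.g. a RUBiS-like load spike.
    # (In a real system the measured utilization would change as the allocation
    # changes; the trace is treated as fixed here purely for illustration.)
    trace = [0.15, 0.12, 0.40, 0.65, 0.70, 0.30, 0.10]
    vcpus = 4
    for t in range(1, len(trace) + 1):
        vcpus = control_step(trace[:t], vcpus)
        print(f"after sample {t}: allocate {vcpus} vCPU(s)")

In a full system, the same predicted operating point could also drive other actuators, for example processor power states or VM consolidation through live migration, which the report cites as key techniques for optimizing power and performance.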

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 0855087
Program Officer: Theodore Baker
Budget Start: 2009-10-01
Budget End: 2012-09-30
Fiscal Year: 2008
Total Cost: $210,000
Name: University of Arizona
City: Tucson
State: AZ
Country: United States
Zip Code: 85721