Today an increasing number of scientists need reliable, extensible, large-scale, high-performance storage that is tightly integrated into their computing and analysis workflows. These researchers often require their data to be available not only on remote computational clusters but also on specialized in-laboratory equipment, workstations, and displays. Unfortunately, many are hindered because their laboratories lack the fundamental storage capacity and bandwidth that is often available only in specialized data centers. For example, leaders in biological research are making their science data intensive by developing and using instruments, such as high field-strength electron microscopes, that generate terabytes of data weekly. While experts can build specialized parallel storage systems today that meet both the capacity and throughput needs of computationally intensive analysis on clusters, the administrative effort required for both initial deployment and ongoing operation is enormous. For many scientists, their needs are rapidly becoming data intensive, but their access to capable and reliable storage is limited by the complexity or expense (or both) of existing solutions. The same kind of barrier existed for computational clusters a decade ago, and it is one the Rocks cluster toolkit has successfully addressed.

In this award, the established and widely used Rocks clustering software toolkit will be expanded to include not only ongoing production support and engineering enhancements for computational clusters but also progressive work on a range of issues directly related to clustered storage provisioning, monitoring, and event generation. In particular, the impact of this award will be to extend the current simplicity of compute cluster deployment to (1) farms of network-attached file servers and (2) dedicated parallel, high-performance storage clusters, through the standard Rocks extension mechanism called Rolls. In addition, development will begin on a monitoring architecture targeted specifically at storage subsystems, including per-disk metrics, file-system metrics, and aggregated network utilization for both Lustre parallel file systems and NFS-based server farms. The investigators will also begin designing and developing mechanisms that enable correlation of file server utilization with jobs running on clients and remote workstations.

Project Report

The Rocks cluster toolkit began as a project to make it easier for domain scientists to deploy and maintain commodity clusters in their laboratories without requiring significant administrative effort. Prior to tools like Rocks, researchers would typically spend days, weeks, and sometimes months installing and configuring software to build a functioning compute (Beowulf) cluster. With Rocks, a complete, functional, up-to-date cluster can be built from "bare metal" (no pre-installed operating system) in just a few hours. Based upon the CentOS distribution (though compatible with nearly any Red Hat-derived distribution), Rocks enables users to easily construct customizable, robust, high-performance clusters. Although the roots of the toolkit are in high-performance computing (HPC), Rocks can be, and is, used to build tiled-display walls, high-performance storage systems, and virtualized computing (cloud) hosting.

The intellectual merit of this project is to investigate efficient, robust, and reproducible methods for building a variety of clustered systems. The end result has been a series of public, open-source software releases that increase both the flexibility and functionality of clustered systems. Rocks is used around the world and at leading US production computing resource providers such as Pacific Northwest National Laboratory, the San Diego Supercomputer Center, and the Texas Advanced Computing Center. It is used at many research and teaching universities throughout the US, including the University of Nevada, Reno, the University of Memphis, Rutgers University, the University of Wisconsin, Florida Atlantic University, Clark Atlanta University, and the University of Connecticut. Small and large companies also deploy clusters using Rocks. The reach of the software toolkit as computing infrastructure means that it impacts every scientist who utilizes Rocks-managed computing and data, even if they are unaware that their favorite computing system is built with this toolkit.

The toolkit tames the inherent complexity of building and configuring clustered systems. A full build of the released software creates over 300 packages that complement the standard single-node operating system installation. Support for MPI (Message Passing Interface), InfiniBand, several load schedulers, upgraded versions of common scripting languages like Python and Perl, bioinformatics tools, advanced storage (e.g., the ZFS file system), virtualization, creation of EC2-compatible cloud images, web servers, and cluster monitoring are all optional extensions supported by the basic toolkit. A fundamental design and intellectual goal was user extensibility via the Rolls mechanism, which enables anyone to develop new packages and, more importantly, the configuration needed for their automatic deployment in a cluster. The toolkit uses an extensible "graph" to define a particular cluster, where Rolls provide well-defined subgraphs. This approach enables software like load schedulers (e.g., HTCondor or Torque) to easily define server and client configurations. The Rolls mechanism automates this configuration so that end users deploy a basic working configuration that can then be further customized to meet their needs. Rolls are standalone components of cluster functionality that can be mixed and matched to create the specific configuration desired by the end user simply by including the roll at the time of installation. Rocks itself is quite scalable, with production deployments on large systems.
Gordon, at the San Diego Supercomputer Center and part of XSEDE, has 1,024 nodes and 16K cores. The PIC (Programmatic Institutional Computing) cluster at Pacific Northwest National Laboratory is over 20K cores. Many users have developed Rolls that go beyond the core functionality of Rocks itself to build highly customized, but fully automated, configurations. Developers within the NSF-funded XSEDE national computing infrastructure have independently created a Roll that incorporates the XSEDE software stack. The production computing group at SDSC has developed more than 40 Rolls focused on a wide range of cluster-aware scientific applications; that group is actively migrating its Rolls from a UCSD-owned software repository to GitHub for wider impact. Rocks also has support for creating virtual clusters and mixes of physical and virtual hosts. While the "heavy lifting" of virtualization is done by the KVM (Kernel Virtual Machine) subsystem in CentOS, Rocks is able to treat virtual nodes as just another "brand" of hardware. Rocks is not image-based (though it can create Amazon EC2-compatible images) but instead uses a programmed approach to system configuration. By tightly integrating with the OS-supplied installer, Rocks is deployable on a vast array of hardware and supports heterogeneous hardware configurations with ease. A worldwide, active discussion list enables Rocks users to help each other or converse directly with developers. The combination of robust software and an open-source model has made clusters easily deployable and manageable for a wide range of scientists across numerous disciplines.
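The Roll and graph mechanism described above can be pictured as merging each Roll's subgraph of configuration nodes into one cluster-wide graph and then traversing that graph for a given appliance type to determine what gets installed. The sketch below is a purely conceptual illustration of that idea in Python; it is not the actual Rocks implementation (Rocks expresses its graph and node definitions in XML files shipped inside each Roll), and the class, roll, and package names here are hypothetical.

```python
from collections import defaultdict

# Conceptual sketch only: models the idea of Rolls contributing subgraphs
# (edges plus per-node package lists) that are merged into one cluster
# configuration graph, then resolved per appliance type.
class ConfigGraph:
    def __init__(self):
        self.edges = defaultdict(list)    # vertex -> child vertices
        self.packages = defaultdict(set)  # vertex -> packages it contributes

    def add_roll(self, edges, packages):
        """Merge one Roll's subgraph into the cluster-wide graph."""
        for parent, child in edges:
            self.edges[parent].append(child)
        for node, pkgs in packages.items():
            self.packages[node].update(pkgs)

    def resolve(self, appliance):
        """Walk the graph from an appliance vertex, collecting every package it pulls in."""
        seen, stack, pkgs = set(), [appliance], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pkgs |= self.packages[node]
            stack.extend(self.edges[node])
        return sorted(pkgs)

# Example: a hypothetical scheduler Roll attaches server configuration to the
# frontend appliance and client configuration to the compute appliance.
graph = ConfigGraph()
graph.add_roll(
    edges=[("frontend", "scheduler-server"), ("compute", "scheduler-client")],
    packages={"scheduler-server": {"scheduler", "scheduler-web-ui"},
              "scheduler-client": {"scheduler-execd"}},
)
print(graph.resolve("compute"))  # ['scheduler-execd']
```

In this picture, including a Roll at installation time simply adds its subgraph before the graph is resolved, which is why server and client sides of a package can be configured automatically for the appropriate appliance types.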

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1032778
Program Officer: Daniel Katz
Budget Start: 2010-09-01
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $499,999
Name: University of California San Diego
City: La Jolla
State: CA
Country: United States
Zip Code: 92093