Dongarra, Jack J.; Plank, James S.; University of Tennessee

Enabling Technology for High-Performance Heterogeneous Clusters Focusing on Grid Middleware, Fault Tolerance and Sparse Matrix Computations

This research instrumentation enables research projects in:

- Harnessing Cluster Resources for Distributed Scientific Computing,
- Fast and Portable Checkpointing in Clusters and Clusters-of-Clusters, and
- Tools for Large-Scale Sparse Matrix Applications on Clusters.

To support these projects, this award contributes to the purchase of a 32-node high-performance cluster, switches, workstations, and interfaces to the existing clusters and visualization lab at the University of Tennessee. High-performance, low-latency clusters assembled from commodity computers and interconnects are a cost-effective alternative to parallel supercomputers, but the software environment on such clusters is primitive at best, and tools for application development and cluster management are needed. The three projects in this proposal address this need. Their common goal is to develop enabling technology for advanced scientific computing applications on large-scale clusters and heterogeneous clusters-of-clusters.

The proposed instrumentation is a high-performance networked cluster of workstations. It will be connected to two small clusters already available in the department to form a sizable, heterogeneous "cluster-of-clusters" with visualization capabilities. This platform will be used for algorithm, software, and tool development research in the three projects.

Budget Start: 1999-03-01
Budget End: 2002-02-28
Fiscal Year: 1998
Total Cost: $150,000
Name: University of Tennessee Knoxville
City: Knoxville
State: TN
Country: United States
Zip Code: 37996