Modern supercomputers are complex, hierarchical systems consisting of huge numbers of cores, systems for disk storage, and nodes for I/O forwarding. As these numbers continue to grow, the need for tools to understand the behavior of the system software becomes paramount: without these tools it is impossible to tune that software effectively, and applications cannot attain high degrees of efficiency. This project addresses the challenge of understanding the behavior of complex system software on very large-scale compute platforms, such as the current petascale computers. In particular, this project is developing software infrastructure to provide end-to-end analysis and visualization of I/O system software. Specifically, the objectives are to develop, improve, and deploy (1) end-to-end, scalable tracing integrated into the I/O system (MPI-IO, I/O forwarding, and file system); (2) information visualization tools for inspecting traces and extracting knowledge; (3) testing components that drive this system to generate example patterns, including a component to generate anomalies; and (4) tutorials and tools for helping other system software developers incorporate this analysis and visualization system into their production software. The software and techniques developed in this project will be directly applicable to and useful in other system software libraries that perform complex interactions on large systems.
For further information see the project web site at the URL: http://vis.cs.ucdavis.edu/NSF/Jupiter
Modern supercomputers are complex, hierarchical systems consisting of huge numbers of cores, systems for disk storage, and nodes for I/O forwarding. As these numbers continue to grow, the need for tools to understand the behavior of the system software becomes paramount: without proper tools it is impossible to tune that software effectively, and applications cannot attain high degrees of efficiency. The objective of this project is to address the challenge of understanding the behavior of complex system software on very large-scale compute platforms, such as the current petascale computers operated at the National Laboratories and Supercomputing Centers. In particular, this project has focused on the development of interactive visualization solutions for inspecting large numbers of traces and extracting knowledge. Visualization transforms computing log data into pictures that, if properly designed, can effectively summarize selected aspects of the data and help guide the overall analysis tasks. The resulting new set of tools can help both users and designers of high-end computing systems gain important insights into the complex interactions among the hardware, software systems, and applications. This project also led to a collaboration with industry, allowing the student researchers to participate in the development and evaluation of a large-scale production computing system.