This award supports the development of Cello, a new parallel adaptive mesh refinement (AMR) software framework. The purpose of Cello is to enable researchers to write multi-physics AMR applications that can harness the enormous computing power of current and future world-class high-performance computing (HPC) platforms. Its distinguishing characteristic relative to existing AMR frameworks is the aggressive pursuit, from the outset, of extreme scalability, both in terms of software data structures and hardware parallelism. Integral to developing Cello will be Enzo-P, a petascale astrophysics and cosmology application built on top of the Cello framework. Enzo-P will not only help drive development of the underlying Cello framework, but will also serve as a highly scalable variant of Enzo, the terascale astrophysics and cosmology community code. Both the Enzo-P science application and the underlying, independent Cello parallel AMR software framework will be released and supported as community software.
Software sustainability will be realized under the dual support of the Laboratory for Computational Astrophysics (LCA) at the University of California, San Diego, and the San Diego Supercomputer Center, organizations devoted to the long-term maintenance of---and user support for---community scientific codes and HPC cyberinfrastructure. Software self-management will be an integral component of the Cello software design, with software resilience a high priority. The Cello framework will be designed to detect hardware and software faults, identify performance and numerical issues, and dynamically reconfigure itself to perform with the highest possible efficiency on the hardware components currently available. Energy efficiency is implicit in the underlying adaptive mesh refinement approach, which dynamically targets computational resources where they are required and avoids expending them where they are not; the adaptivity of AMR therefore translates directly into energy savings.

This project synthesizes the best practices of existing parallel AMR frameworks and adds several innovations that improve the quality of the mesh for various types of simulations. It takes a hierarchical approach to parallelism and load balancing, maximizing data locality while minimizing global synchronization. This will enable the construction of AMR applications of unprecedented spatial and temporal dynamic range on tomorrow's hierarchical HPC architectures. Adaptive mesh refinement has proven to be a powerful numerical technique in a wide variety of disciplines in the pure and applied sciences and engineering. Existing frameworks are unlikely to scale to tens and hundreds of millions of cores, so applications built on them will need to move to more scalable and fault-tolerant frameworks; Cello is being built with these applications in mind. The broader impacts of this software development will come through the science and engineering applications that Cello supports, as well as the novel design principles it embodies.
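To make the resource-targeting argument concrete, the sketch below shows, under simplified assumptions, how block-structured AMR allocates fine cells only where a local error indicator exceeds a threshold, so the total cell count (a proxy for compute and energy cost) grows with the complexity of the solution rather than with the size of the domain. The `Block` struct, `indicator` function, and threshold are illustrative stand-ins only and do not correspond to Cello's actual interface.

```cpp
// Illustrative block-structured AMR refinement in 1D (not Cello code).
// Blocks are refined only where a local error indicator is large, so
// compute/energy cost tracks solution complexity, not domain size.
#include <cmath>
#include <iostream>
#include <vector>

struct Block {
  double xlo, xhi;  // spatial extent of the block
  int level;        // refinement level (0 = coarsest)
};

// Hypothetical error indicator: large only near a sharp feature at x = 0.3.
double indicator(const Block& b) {
  double xc = 0.5 * (b.xlo + b.xhi);
  return std::exp(-100.0 * (xc - 0.3) * (xc - 0.3));
}

int main() {
  const double threshold = 0.01;  // refine where the indicator exceeds this
  const int max_level = 6;
  const int cells_per_block = 16;

  std::vector<Block> mesh = {{0.0, 1.0, 0}};  // start with one coarse block
  bool refined = true;
  while (refined) {
    refined = false;
    std::vector<Block> next;
    for (const Block& b : mesh) {
      if (b.level < max_level && indicator(b) > threshold) {
        double xm = 0.5 * (b.xlo + b.xhi);        // bisect the block
        next.push_back({b.xlo, xm, b.level + 1});
        next.push_back({xm, b.xhi, b.level + 1});
        refined = true;
      } else {
        next.push_back(b);
      }
    }
    mesh = next;
  }

  long adaptive_cells = static_cast<long>(mesh.size()) * cells_per_block;
  long uniform_cells  = cells_per_block * (1L << max_level);  // uniform mesh at finest level
  std::cout << "adaptive cells: " << adaptive_cells
            << "  uniform cells: " << uniform_cells << "\n";
}
```

A production framework would do this in three dimensions with octrees and would also coarsen where the indicator falls below a lower threshold, but the cost argument is the same: cells are only spent where the solution requires them.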
The purpose of the grant was to develop new scientific software capable of simulating complex astrophysical and cosmological phenomena on the current high end of supercomputers, so-called petascale computers. Such supercomputers have more than a hundred thousand processors, and existing simulation software must be rewritten from scratch to run on them. A key requirement is parallel scalability, meaning that the computation must run efficiently on this many processors or more; the next generation of supercomputers will have as many as 100 million processors. We designed, implemented, and tested the Enzo-P/Cello software with this goal in mind.

We did this in two pieces: the Cello software infrastructure, which implements the most scalable algorithm known for hierarchical adaptive meshing (essential for astrophysical and cosmological applications), and the Enzo-P application software, which solves the relevant physics equations building on the Cello infrastructure. Both codes use object-oriented design methods for scalability, readability, supportability, and extensibility. Parallel computing is provided by interfacing Cello to the CHARM++ parallel objects system developed by the Parallel Programming Laboratory at the University of Illinois.

We have demonstrated the potential of this new software by running scalability tests on the NCSA Blue Waters sustained-petascale supercomputer at the University of Illinois. We achieved almost perfect scaling on a fully adaptive gas dynamics simulation on 64,000 processor cores, which is as far as we have pushed it. This simulation would have been completely impossible with our non-petascale version of the Enzo code.

Cello and Enzo-P have been released to the public as open source software. Cello is a completely general-purpose adaptive mesh infrastructure library that can be adopted by other application communities for their needs, and the Enzo-P/Cello combination provides new capabilities to the large, existing developer/user community of the non-petascale Enzo code. A key outcome of this project was the award of a follow-on NSF grant to work with the existing Enzo developer/user community to expand the capabilities of the Enzo-P/Cello code base. This will allow the community to attack previously intractable problems such as the formation of star and galaxy clusters.
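The division of labor described above (Cello owning the mesh and parallelism, Enzo-P supplying the physics) can be illustrated with a small object-oriented sketch. The `Method` base class, `EnzoMethodHydro` subclass, and `Block` fields below are hypothetical stand-ins rather than the actual Cello/Enzo-P API; they only show how an application layer can extend an AMR infrastructure by subclassing and registering physics solvers.

```cpp
// Illustrative layering of an AMR infrastructure and a physics application
// (hypothetical names, not the real Cello/Enzo-P classes).
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// --- "infrastructure" layer: owns mesh blocks and drives the time step ---
struct Block {
  int level;                  // refinement level of this mesh block
  std::vector<double> field;  // one cell-centered field, for illustration
};

class Method {  // base class that a physics application subclasses
 public:
  virtual ~Method() = default;
  virtual std::string name() const = 0;
  virtual void compute(Block& block, double dt) = 0;  // advance one block
};

class Simulation {
 public:
  void add_method(std::unique_ptr<Method> m) { methods_.push_back(std::move(m)); }
  void add_block(Block b) { blocks_.push_back(std::move(b)); }
  void cycle(double dt) {  // apply every registered method to every block
    for (Block& b : blocks_)
      for (auto& m : methods_) {
        std::cout << "applying " << m->name() << " on level " << b.level << "\n";
        m->compute(b, dt);
      }
  }
 private:
  std::vector<std::unique_ptr<Method>> methods_;
  std::vector<Block> blocks_;
};

// --- "application" layer: a physics solver plugged into the infrastructure ---
class EnzoMethodHydro : public Method {
 public:
  std::string name() const override { return "hydro"; }
  void compute(Block& block, double dt) override {
    for (double& v : block.field) v += dt * 0.1;  // placeholder physics update
  }
};

int main() {
  Simulation sim;
  sim.add_method(std::make_unique<EnzoMethodHydro>());
  sim.add_block({0, std::vector<double>(8, 1.0)});
  sim.add_block({1, std::vector<double>(8, 1.0)});
  sim.cycle(0.01);
}
```

In the real codes the blocks would be distributed parallel objects (CHARM++ chares) rather than entries in a local vector, so the infrastructure layer also handles migration, load balancing, and asynchronous communication; the sketch only conveys the extensibility pattern.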