The starting point for this proposal is a view of scientific simulation articulated in the conclusions of the 2008 National Academy of Sciences Study, The Potential Impact of High-End Capability Computing on Four Illustrative Fields of Science and Engineering: "Advanced computational science and engineering is a complex enterprise that requires models, algorithms, software, hardware, facilities, education and training, and a community of researchers attuned to its special needs." (p. 122)
Over the last few years, the design of computer and software systems, particularly as they relate to simulation in the physical sciences, has been organized around a collection of algorithmic patterns, or motifs. These patterns have been very productive because they are a natural "common language" in which application scientists can express their computations, and for which computer scientists can provide optimized libraries, domain-specific languages, compilers, and other software tools.
This project will design an institute focused on a subset of these patterns --- structured grid discretizations of partial differential equations and particle methods, along with the linear and nonlinear solvers that enable their effective use --- with the specific goal of providing simulation capabilities for a set of scientific domains that make heavy use of these patterns. The proposed institute, called the Institute for High-Performance Computational Science with Structured Meshes and Particles (HPCS-SMP), is envisioned to have two major components. The first component is a software infrastructure development activity that will be performed by a team whose expertise spans the design and development of mathematical algorithms and software frameworks, as well as the design and development of compilers, runtime systems, and tools that enable one to obtain high performance from emerging multicore and heterogeneous architectures. The second component is an outreach activity, in which algorithms, libraries, and software frameworks developed by the institute will be customized and integrated into simulation codes for stakeholder application domains. At the heart of this activity will be collaborations and partnerships, in which the institute will provide one or more software developers to collaborate with application scientists over a period of months to years to develop a new simulation capability or enhance an existing one.
The design of this institute will be carried out through a series of workshops, each focused on one of five stakeholder science domains that have been identified as using these motifs and that play a central role in various NSF Grand Challenge problems, with participation of representatives of both the science domain and the relevant mathematics and computer science communities. In addition, there will be a final workshop that will bring together the relevant mathematics and computer science experts to identify cross-cutting themes. The information obtained from these workshops will be used by the project to develop the final conceptual design of the institute, in the form of a document that includes the input from all of the workshops and our analysis of how this leads to a design of a software institute.
Scientific application developers have been confronted with a transition from single-core processors with homogeneous memory to processors with multiple cores, heterogeneous performance, and complex multilevel memory hierarchies with non-uniform behavior. These changes have reached a point where the current approaches to developing scientific simulation software are no longer viable for producing efficient codes for next-generation parallel systems. Our proposed solution is to develop a shared software infrastructure to support multiple scientific domains that possess a common mathematical structure in their models. We focus on two algorithmic motifs: structured grids (including adaptive mesh refinement) and particles. Our approach to determining the feasibility of such an effort was to hold a series of workshops in five scientific areas of importance to NSF that use these motifs -- combustion for engineering systems, astrophysics and cosmology, plasma physics and kinetics, spatial modeling in biology, and climate modeling. Across these areas, we found a common set of requirements large enough to justify the development of a common software infrastructure. Findings include the need to design new discretization algorithms that are better suited for the next generation of processors, and the need to develop a software stack based on Domain-Specific Languages (DSLs) that are extensions to widely used programming languages. Such a software stack would raise the semantic level of user programs to that of the mathematical description of the discretizations, thus enabling platform-specific optimizations to be performed by compilers rather than in user code. This effort could be executed by a software institute consisting of three components of roughly equal level of effort and funding.
One is the development of the DSL-based software stack, including compiler and runtime support, as well as productivity tools to support debugging and performance analysis. Another would be the development of mathematical libraries and frameworks using this software stack that would support the new discretization algorithms required to obtain high performance on the next generation of platforms. Finally, the institute would include several activities devoted to the dissemination of the software: the development of general training materials -- user guides, example applications, short courses -- to bring new users up to speed on how to use this software effectively; and the execution of targeted application campaigns that would develop new scientific simulation capabilities in collaboration with various scientific communities.