This project builds upon a previously funded middleware development effort by the Research in Advanced Distributed Cyberinfrastructure and Applications Laboratory (RADICAL) at Rutgers University; the middleware building blocks developed in that effort are known as RADICAL-Cybertools (RCT). In the current project, the team pursues a targeted set of developments, driven by the need to scale the number of software components, users, and supported platforms, and to improve performance, engineering processes, and sustainability. The resulting capabilities will serve scientific applications in multiple domains, including software engineering, chemical physics, materials science, health science, climate science, drug discovery, and particle physics.
This project builds upon a prior prototype investment, which developed a pilot system for leadership-class HPC machines and a Python implementation of SAGA, a distributed computing standard. The current effort is organized around three activities:
- Extending RCT functionality to reliably support a range of novel applications at scale (for example, tightly coupling traditional HPC simulations with machine-learning methods);
- Enhancing RCT to support new NSF systems, such as the Frontera supercomputing system;
- Prototyping a new component: a campaign manager for computational resource management.
Data-driven approaches will be used to improve software development, engineering, and life-cycle management, and to enhance the long-term sustainability of RCT and the communities it supports. The project includes use cases that are representative of the growing community that RCT engages and supports, such as the ATLAS high-energy physics project and the QCArchive project, which enables large-scale force-field construction and physical property prediction.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.