Human society is faced with an increasing number of problems including stubborn diseases and international security/climate threats. The computer simulations and advanced data management methods necessary to solving these societal problems can only be realized through increased use of parallel computing at all system scales, including desktops, servers and the cloud. Efficient large-scale parallel computing however requires advanced parallel programming methods. Such methods, unfortunately, have a greater proclivity for software bugs that increase cost through lost cycles on super-computers and these same bugs undermine confidence in simulation results. This research addresses the challenge of developing parallel computing software by creating new scalable methods to support advanced parallel programming models that provide rigorous guarantees on program correctness. The societal impacts of this work stem from increasing reliability of software powering the national infrastructure, advanced educational methods to train future generations, and pedagogical material in the form of course notes and software for broad dissemination. It also helps maintain the United States in a leadership situation with respect to the available talent pool in this area.
Providing rigorous guarantees on correctness of existing parallel computing software requires that two classes of methods be developed, evaluated, and taught widely: scalable code-level (static) checking methods, and downstream detailed (dynamic) checking methods. This project develops these novel and much-needed correctness checking methods around the Habanero Java programming and compilation system. The research is to augment the system with correctness obligations emitted during compilation and checked at all later stages of translation and deployment. A key highlight of the project's approach is that it allows some of the correctness obligations to be checked statically in the context of safe subsets of Habanero Java. Obligations that are not able to be statically checked, especially for larger subsets of the Habanero language, are marked for checking dynamically through novel active-testing methods. An Operational Semantics written in the Coq notation lends cohesion to the work by ensuring that the division of correctness checking between static and dynamic techniques is sound. In summary, this research helps advance the science of parallel programming in terms of rigorous correctness checking methods, while at the same time contributing to the broad practice of programming at all scales from desktop to cloud computing and high-end scientific simulations.