High-performance computing suffers from a performance bottleneck that wastes computation time, money, and energy: processing cores on multicore systems sit idle waiting for the memory accesses of irregular kernels. These irregular kernels typically accomplish little computational work relative to the high cost of their memory accesses. Remedying these costly bottlenecks requires a new approach to high-performance computing. At the same time, computing is evolving: it is becoming less dependent on the low-level programming languages that give rise to these bottlenecks and more dependent on learning algorithms, such as neural networks, to attain the necessary efficiency. This project builds the foundation for accelerating irregular kernels by replacing them with neural networks that run on accelerators optimized for neural networks. These neural networks offer better performance and lower energy consumption. Additionally, these networks are tuned in high-level programming languages (e.g., Python) that are easier for novice users to learn, allowing more computer scientists to aid the scientific and high-performance computing communities. This project also develops new curricula, adding neural-accelerator topics and expanded neural-network algorithm materials to traditional undergraduate courses. In both its research and educational aspects, this project significantly reduces the development time and costs of high-performance computing while simultaneously reducing performance bottlenecks. Furthermore, this project will support graduate and undergraduate students as they engage in cross-disciplinary work to meet the accuracy and performance constraints of the scientific-modeling and big-data-analysis communities that currently depend on irregular kernels for areas such as climate modeling, large-scale circuit design, and drug analysis for infectious diseases.
The goal and scope of this project are to build a framework that optimizes irregular kernels for both performance and energy usage through neural acceleration, i.e., representing and executing a kernel as a neural network. The methods used to meet the project's goal and scope include the following: 1) the development of an approximation-bound characteristic that quantifies acceptable error bounds on the developed neural networks along with performance and energy requirements; 2) the development of initial neural networks for commonly used irregular kernels that can serve as starting networks for more complex irregular kernels and can be used by individuals tuning their own kernels (made available through a public database created and maintained by the investigator to support research in this area); and 3) the construction of a toolchain that aids in identifying irregular kernels in code, constructing neural networks based on user input, and deciding how the neural networks should be scheduled. The delivered toolchain supports popular libraries such as TensorFlow and will be disseminated via an open-source repository. The transformative impact of this effort is a completely new optimization option for irregular kernels and a base set of tools (i.e., a public database and a scheduling toolchain) that will foster future advances in applying neural acceleration to various codes and lead to significant advancements in science and engineering. As such, this new optimization option may inspire a new computational model in a post-Moore era that provides timely scientific data for urgent government policy, such as climate change and foreign affairs. This project is jointly funded by the CAREER Software and Hardware Foundations HPC program and the Established Program to Stimulate Competitive Research (EPSCoR).
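To make the neural-acceleration idea and the approximation-bound check concrete, the following minimal Python/TensorFlow sketch trains a small network to stand in for a toy sparse matrix-vector product (a representative irregular kernel) and accepts the surrogate only if a held-out relative-error bound is met. The toy kernel, network shape, and error_bound threshold are illustrative assumptions for this sketch, not the project's actual framework or toolchain.

    # Sketch: neural acceleration of an irregular kernel with an
    # approximation-bound check. All specifics here are illustrative.
    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    n = 64

    # Irregular kernel to be replaced: a sparse matrix-vector product
    # (SpMV), whose memory accesses follow A's random sparsity pattern.
    A = rng.random((n, n)) * (rng.random((n, n)) < 0.05)  # ~5% nonzero
    def spmv(x):
        return A @ x

    # Training data: sampled inputs paired with the kernel's exact outputs.
    X = rng.standard_normal((10000, n)).astype(np.float32)
    Y = np.array([spmv(x) for x in X], dtype=np.float32)

    # Small surrogate network that replaces the kernel's irregular memory
    # accesses with regular, accelerator-friendly dense computation.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, Y, epochs=20, batch_size=256, verbose=0)

    # Approximation-bound check (hypothetical criterion): accept the
    # surrogate only if its mean relative error on held-out inputs stays
    # under a user-supplied bound.
    X_test = rng.standard_normal((1000, n)).astype(np.float32)
    Y_test = np.array([spmv(x) for x in X_test])
    Y_pred = model.predict(X_test, verbose=0)
    rel_err = np.linalg.norm(Y_pred - Y_test, axis=1) / (
        np.linalg.norm(Y_test, axis=1) + 1e-12)
    error_bound = 0.05  # illustrative tolerance
    print("mean relative error:", rel_err.mean(),
          "accepted:", bool(rel_err.mean() < error_bound))

In the project's envisioned workflow, a check of this kind would be driven by the approximation-bound characteristic (error, performance, and energy requirements) rather than a single hand-picked tolerance.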
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.