In high-performance scientific computing, the rapid emergence of hybrid processors that rely heavily on accelerator technologies, such as Graphics Processing Units (GPUs) or the Intel Xeon Phi (also known as Many Integrated Core, or MIC), raises critical new challenges for computational scientists. Their research applications typically depend on computational kernels (i.e., software implementations of one or more of the basic patterns of scientific computing) that are optimized for speed. Such programs spend most of their computing time executing one or more of these kernels, and long experience has taught developers that tuning kernels for the architecture of a given processor is essential to achieving high performance at the level of the individual computing node. Since scientists want to run these applications on supercomputers with thousands of such nodes, high performance at the node level is essential to high productivity for the application at large. Unfortunately, for the vast majority of computational kernels, the three classic approaches to performance tuning (compiler-driven code transformations, low-level manual programming, and empirical autotuning) have always been very difficult and have often produced mixed results, and the emerging era of hybrid processors makes all three techniques less effective still. The Bench-testing Environment for Automated Software Tuning (BEAST) makes a substantial contribution to solving this problem. BEAST creates a framework for exploring and optimizing the performance of computational kernels on hybrid processors that 1) applies to a diverse range of computational kernels, 2) (semi)automatically generates better-performing implementations on various hybrid processor architectures, and 3) increases developer insight into why given kernel/processor combinations have the performance profiles they do.
To achieve this threefold goal, BEAST applies the model used for traditional application benchmarking in a novel way: it combines an abstract kernel specification and a corresponding verification test, similar to standard benchmarking, with an automated testing engine and data analysis and machine learning tools, collectively called the BEAST workbench. Using a new method for specifying language-neutral code stencils and a prototype BEAST workbench, the project explores alternative tuning methods and strategies for a diverse range of computational kernels. Experiments carried out under this project are expected to show that the BEAST framework can dramatically improve the performance of many computational kernels that are of fundamental importance to scientific computing. As this software and the techniques for using it are made widely available to the science and engineering community, they will help to ensure the timely delivery of performance-optimized kernels for many domains and many types of hybrid processors, giving the BEAST bench-tuning software infrastructure a very broad impact. Scientists and engineers across a vast array of intellectually, economically, and socially important domains will be able to rapidly tune the underlying kernels in their applications to the characteristics of the latest platform, and thereby quickly gain the productivity benefits of each successive generation of accelerator technology.
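The combination described above (a kernel specification, a verification test, and an automated testing engine) can be illustrated with a minimal sketch of empirical autotuning. This is not the BEAST workbench or its stencil language; the kernel (a tiled matrix multiply), the function names `reference_matmul`, `tiled_matmul`, `verify`, and `tune`, and the candidate tile sizes are all hypothetical choices made for illustration, and pure Python stands in for the accelerator code a real tuner would generate.

```python
import time
from itertools import product


def reference_matmul(a, b):
    """Plain triple-loop product, used as the verification oracle."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def tiled_matmul(a, b, tile):
    """Candidate kernel: the same product computed tile by tile (square inputs)."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii, kk, jj in product(range(0, n, tile), repeat=3):
        for i in range(ii, min(ii + tile, n)):
            row_a, row_c = a[i], c[i]
            for k in range(kk, min(kk + tile, n)):
                aik, row_b = row_a[k], b[k]
                for j in range(jj, min(jj + tile, n)):
                    row_c[j] += aik * row_b[j]
    return c


def verify(candidate, expected, tol=1e-9):
    """Verification test: every entry must match the oracle within tol."""
    return all(abs(x - y) <= tol
               for row_c, row_e in zip(candidate, expected)
               for x, y in zip(row_c, row_e))


def tune(n=24, tiles=(2, 4, 8, 24), repeats=3):
    """Testing engine: check each variant, time the correct ones, keep the fastest."""
    a = [[(i * n + j) % 7 + 0.5 for j in range(n)] for i in range(n)]
    b = [[(i + 2 * j) % 5 - 1.0 for j in range(n)] for i in range(n)]
    expected = reference_matmul(a, b)
    timings = {}
    for tile in tiles:
        if not verify(tiled_matmul(a, b, tile), expected):
            continue  # reject incorrect variants before timing them
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            tiled_matmul(a, b, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best  # best-of-N reduces timing noise
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_tile, timings = tune()
    print("fastest tile size:", best_tile)
```

The tile size that wins depends on the machine and the interpreter, which is the reason such searches are run empirically on each target platform instead of being fixed in advance.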

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 1320603
Program Officer: Almadena Chtchelkanova
Project Start:
Project End:
Budget Start: 2013-08-01
Budget End: 2016-07-31
Support Year:
Fiscal Year: 2013
Total Cost: $499,995
Indirect Cost:
Name: University of Tennessee Knoxville
Department:
Type:
DUNS #:
City: Knoxville
State: TN
Country: United States
Zip Code: 37916