Modern high-performance systems are chronically underutilized, not because their job queues are short, but because each task uses only a small fraction of the computing power available on each node. Compiled code simply does not achieve a reasonable fraction of the available performance, despite years of work in academia and in research labs aimed at improving the situation. Some of the performance gap is due to architecture and design; however, a portion of it is directly attributable to the design of optimizing compilers. Standard optimization techniques fail to discover and fix small regions of underperformance, in part because they operate at procedure-level granularity and in part because the remedy requires the compiler to change how it optimizes that part of the code, using another order or another transformation scheme.
Intellectual Merit
In this project, the PI will develop a feedback-driven adaptive compilation harness that specifically targets small regions of underperformance. To make the regional scheme work, the PI's team will enhance and reformulate two powerful optimizations, rematerialization and reassociation, and re-scope others so that they operate at a regional rather than a global scope.
Broader Impact
This project will directly support graduate education at Rice. Positive results from the research will be taught in a Rice graduate course on the subject; the material will reach a broad audience because the teaching materials for the course are used at other institutions. The work in this project will add to the code base of the Rice University research compiler, which is distributed free of charge in source-code form to others in both academia and industry. The PI will use a combination of local funds and directed research projects to involve undergraduates directly in this work.