Two trends in computation have conspired to make efficient exploitation of computational resources difficult. First, the applications for which domain experts are interested in deploying computational power--computational genomics, video processing, data analytics--are increasingly irregular, with patterns of computation and data access that are difficult to reason about. Second, the computational platforms that domain experts want to exploit are increasingly heterogeneous, built around accelerators that present vastly different performance characteristics, programming models, and efficiency tradeoffs. Most worryingly, these accelerators often require careful mapping of computation and data access to achieve maximum performance--exactly the task that is difficult in irregular computations. This project's novelties are creating new domain-specific languages (DSLs) to allow programmers to express complex computations in a high-level, easy-to-understand way, then mapping those DSLs to novel intermediate representations that allow programs in different domains to be expressed in a common representation for optimization. This intermediate representation will then be transformed using provably safe transformations to improve performance without sacrificing correctness guarantees. Finally, the transformed programs will be automatically tuned for accelerators, with that tuning dependent on the specific resource profile of the accelerator and the resource demands of the application. This project's impacts will be unlocking the power of accelerator-based platforms to a broader group of programmers and scientists and providing a DSL and the associated compilation framework to two rich problem domains (streaming video processing and computational genomics) and to new accelerators.

While different problem domains require different abstractions to effectively capture their computations--string-processing kernels for computational genomics, filtering and transformation kernels for video processing--these abstractions can often effectively be mapped to a common intermediate representation that nevertheless captures high-level properties of program execution such as data-access patterns and parallelism properties. This intermediate representation forms the basis for domain-agnostic transformations that can restructure computation to templates that fit different accelerator paradigms (for example, choosing wide parallelism for accelerators like Graphics Processing Units (GPUs), or narrower parallelism for vector units in multi-cores). This project will then use machine learning to develop accelerator models that allow these templates to be instantiated with accelerator-specific parameters for each application (e.g., tuning block size in a GPU kernel), and therefore maximize performance. Finally, to ensure that this transformation and tuning pipeline is sound, this project will develop novel verification techniques to ensure that each translation preserves correctness. These tools will allow programmers to write accelerator programs without concerning themselves with the details of the accelerators they are targeting for correctness or performance.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Type
Standard Grant (Standard)
Application #
1919197
Program Officer
Anindya Banerjee
Project Start
Project End
Budget Start
2019-10-01
Budget End
2023-09-30
Support Year
Fiscal Year
2019
Total Cost
$1,250,000
Indirect Cost
Name
Purdue University
Department
Type
DUNS #
City
West Lafayette
State
IN
Country
United States
Zip Code
47907