The ubiquity of multi-core processors has brought parallel computing squarely into the mainstream, making it essential to develop parallel implementations of a large number of existing sequential codes. Programming these architectures so that they effectively tap the potential of multiple on-chip processing units remains a significant challenge. Although compiler techniques for automatic parallelization have advanced considerably, the current state of practice leaves much to be desired. The pressing need for systematic, general, and effective theoretical foundations for such efforts is a major motivation for this project.
This project will build on recent developments in the polyhedral model that show great promise for building effective automatic parallelization frameworks for multi-core architectures. With the polyhedral model, it is possible to reason about the correctness of complex loop transformations in a completely mathematical setting, using powerful machinery from linear algebra and linear programming. This allows loop transformations to be applied in an effective, integrated manner, and can therefore serve as the basis for a powerful automatic parallelization framework that targets different multi-core platforms. The project will address several key issues in developing an automatic parallelization and data locality optimization framework that is effective over a range of user application codes: (i) model-driven search for determination of effective tile sizes and loop fusion choices; (ii) extended tiling approaches, such as overlapped and split tiles, to enhance concurrency; (iii) automatic generation of parallel code for accelerators with multiple distinct address spaces; and (iv) development of an extensive benchmark suite for assessment of automatic parallelization systems. The developed software will be made publicly available.
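
As an illustration of the kind of reasoning the polyhedral model supports, the following minimal sketch checks whether a candidate loop transformation is legal and whether the transformed loops can be tiled. It is not the project's software. It assumes the simplest setting, in which dependences are given as constant distance vectors and the transformation is an integer matrix, and all function and variable names are hypothetical. The example dependences are those of a one-dimensional Jacobi-style stencil, chosen here only for illustration.

```python
# Illustrative sketch only: assumes uniform (constant) dependence distance
# vectors and a unimodular integer transformation matrix. All names are hypothetical.

def lex_positive(v):
    """Return True if the first nonzero entry of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False  # the zero vector is not lexicographically positive


def apply(T, d):
    """Matrix-vector product T * d over plain integer sequences."""
    return [sum(t * x for t, x in zip(row, d)) for row in T]


def is_legal(T, deps):
    """T preserves the order of dependent iterations if every transformed
    distance vector remains lexicographically positive."""
    return all(lex_positive(apply(T, d)) for d in deps)


def is_tilable(T, deps):
    """Rectangular tiling of the transformed loops is valid if every row of T
    (a hyperplane) has a non-negative dot product with every dependence."""
    return all(c >= 0 for d in deps for c in apply(T, d))


# Distance vectors over (t, i) for a stencil in which a[t][i] reads
# a[t-1][i-1], a[t-1][i], and a[t-1][i+1].
deps = [(1, -1), (1, 0), (1, 1)]

identity = [[1, 0], [0, 1]]     # original loop order
interchange = [[0, 1], [1, 0]]  # swap the t and i loops
skew = [[1, 0], [1, 1]]         # (t, i) -> (t, t + i)

for name, T in [("identity", identity), ("interchange", interchange), ("skew", skew)]:
    print(name, "legal:", is_legal(T, deps), "tilable:", is_tilable(T, deps))
```

Under these assumptions, the original loop order is legal but cannot be tiled rectangularly, loop interchange is illegal, and skewing the space loop by the time loop yields loops that are both legal and tilable. A full framework would replace the constant distance vectors with dependence polyhedra and use linear programming to search for suitable hyperplanes.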