Accelerating single programs on multicore processors remains an outstanding challenge in computer systems design. Existing parallel systems achieve little speedup on programs other than regular dense-matrix codes, yet most of the world's programs fall outside that narrow class and are broadly termed non-regular code. Some non-regular codes have little parallelism beyond instruction-level parallelism (ILP), so no speedup is possible on multicores. In other non-regular codes, however, parallelism is present but existing systems cannot exploit it. Reasons include high synchronization costs, non-loop parallelism, non-array data structures, recursively expressed parallelism, and parallelism that is too fine-grained to pay for its own overhead.
Previous work by the PIs presented the PRAM-based XMT parallel architecture, which has demonstrated good speedups on non-regular codes: 23X on breadth-first search in graphs and 9X on finding a spanning tree in graphs, using 64 processors vs. the best-in-class serial processor.
This project is developing new compiler technologies for XMT, and for uniform memory access (UMA) architectures more generally, that achieve scalable performance in the face of architecture decisions made for scalability. These include better task schedulers that use global queues rather than work stealing; improved prefetching tailored to XMT's unique memory hierarchy; and the use of scalable, non-cache-coherent scratch-pad memory local to each XMT processor to reduce accesses to expensive remote memory.
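As a rough illustration of the global-queue scheduling idea mentioned above, the following minimal sketch has every worker claim its next task from one shared counter instead of stealing from other workers' private queues. It is not XMT's scheduler or compiler output: the names `GlobalTaskQueue`, `take`, and `run_parallel` are hypothetical, and the lock is only a stand-in for an atomic fetch-and-increment operation that a machine might provide.

```python
import threading


class GlobalTaskQueue:
    """One shared queue of task indices (hypothetical name, for illustration)."""

    def __init__(self, num_tasks):
        self._next = 0
        self._num_tasks = num_tasks
        # Assumption: this lock stands in for an atomic fetch-and-increment.
        self._lock = threading.Lock()

    def take(self):
        """Claim the next unclaimed task index, or return None when none remain."""
        with self._lock:
            if self._next >= self._num_tasks:
                return None
            idx = self._next
            self._next += 1
            return idx


def run_parallel(task, num_tasks, num_workers=4):
    """Run task(i) for every i in range(num_tasks) using a single global queue."""
    queue = GlobalTaskQueue(num_tasks)
    results = [None] * num_tasks

    def worker():
        # Each worker keeps pulling from the same queue until it is empty,
        # so uneven task costs balance out without any stealing logic.
        while True:
            idx = queue.take()
            if idx is None:
                return
            results[idx] = task(idx)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Irregular work: task i sums the first i integers, so costs differ per task.
    print(run_parallel(lambda i: sum(range(i)), 8))
```

With a single shared queue, load balance follows from every idle worker taking the next available task. The cost is contention on the shared counter, which is why such a design depends on a cheap atomic claim operation.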
The broader impacts of this project are (i) the development of compiler technologies necessary to reduce the research risk of XMT to the point where industry is willing to commercialize the technology; (ii) the delivery of scalable speedups for previously hard-to-parallelize applications; (iii) the demonstration of technologies for robust performance across large classes of serial, regular parallel, and non-regular parallel programs; (iv) the demonstration of a serious contender for a future universal desktop architecture; and (v) educational and outreach initiatives to popularize XMT and improve the skills of the future workforce.