Recently, the microprocessor industry has moved toward multicore microprocessor designs as a means of utilizing the increasing transistor counts in the face of physical and micro-architectural limitations. Unfortunately, providing multiple cores does not directly translate into performance for most codes. To make use of multicore, many new languages have been proposed to ease the burden of writing parallel programs, yet the programming effort involved in creating correct and efficient parallel programs is still far more substantial than writing the equivalent single-threaded version. A more attractive approach is to rely on tools, both compilers and runtime optimizers, to automatically extract threads from sequential applications. Unfortunately, despite decades of research on automatic parallelization, most techniques have only been effective in the scientific and data-parallel domains. With recently gained insight, the investigators showed that the limits of prior thread-extraction approaches are not fundamental. By applying known and new compilation techniques in a systematic manner, the investigators found that SPEC CINT2000, among the most sequential of codes, has abundant scalable parallelism.
In this project, the team of investigators is taking the initial steps toward developing the techniques necessary to build an automatic thread extraction framework. These techniques include developing static transformations that extract parallelism and quantifying the opportunities for dynamic optimization. The system will ultimately consist of a series of static transformations and compiler-inserted hints combined with a run-time optimization component. This framework will address the multicore challenge by reliably extracting parallelism from a wide range of applications without burdening the programmer with what should remain to be low-level implementation details.