This CAREER research project aims to significantly improve the design productivity and quality of heterogeneous computer architectures, which extensively integrate specialized hardware accelerators to continue to provide the computing improvements essential to all aspects of our society. Achieving this goal requires the development of a new class of truly integrated design automation methodologies and tools to enable productive modeling, exploration, and generation of hardware accelerators from high-level programs, especially for the irregular programs that are commonplace in emerging application domains such as computer vision, machine learning, physical simulation, and social network analytics. The project also has a broad yet thematically focused plan for educational outreach, which aims to cultivate the next generation of engineers and scientists who can bridge the chasm between the software and hardware design paradigms. The PI will lead hands-on design sessions for underrepresented minority high school students and organize engineering seminars with engaging demonstrations for first-year undergraduates to increase their interest and participation in computer engineering. In addition, the PI will actively integrate the research outcomes into undergraduate and graduate curriculum development, and leverage industrial collaborations to effectively disseminate the research results on heterogeneous computing to a broader audience.
Diminished benefits of technology scaling have led to a growing interest in heterogeneous accelerator-rich system architectures to improve performance under tight power and energy efficiency constraints. Irregular programs are gaining prominence in many important application domains; but these programs are much more difficult to parallelize on conventional data-parallel accelerators such as GPUs, as they typically exhibit less-structured data access patterns and difficult-to-predict dynamic parallelism. This project aims to develop a synergistic design automation framework where a set of novel programming abstractions, architectural templates, synthesis optimization algorithms, and hardware prototypes all play concerted roles to overcome the many challenges raised by the irregular programs. Specifically, the key idea is to automatically generate softly synthesized accelerators that are capable of decoupling data access from computation for tolerating memory latency and performing run-time optimizations for exploiting the irregular parallelism.