As processor microarchitecture has evolved over the past 30 years, both the complexity of the design and the number of transistors used in its realization have escalated. This complexity makes guaranteeing correct operation under all corner cases (logical as well as electrical) an ever more challenging task. This, in turn, translates into ever higher practical barriers to microarchitectural innovation for performance optimization, because correctness is an overriding consideration. The goal of this CAREER proposal is to explore a new design model called Performance-correctness Explicitly-decoupled Architecture. From the ground up, an explicitly-decoupled architecture is designed so that the performance optimization circuitry is an independent entity, separate from the circuitry guaranteeing functionality and correctness. This explicit separation allows performance optimization to be considered in a truly common-case-only manner, permitting the use of probabilistic techniques considered impractical or even incorrect in a conventional monolithic microarchitecture. This project seeks to develop insights into this design model and to develop complexity-effective microarchitectural and software mechanisms for performance optimization.

Project Report

Computation has long been a central technology for modern society. Our reliance on it seems only to increase as computers become more and more powerful. However, at a recent industry inflection point, the performance of a single processor core began to improve much more slowly. Instead, performance gains now come largely from placing more and more cores on a single chip. While in theory all programs can be written to take advantage of the many cores available on a single chip, it is not as easy in practice: efficient parallel programming has been a real challenge for the past 50 years or more. The major goal of this CAREER project is to understand a special single-thread performance improvement technique, which we call decoupled architecture. The architecture allows performance improvement (via look-ahead) to be separated from the correctness guarantee in executing the program, permitting efficient techniques for deep look-ahead. We seek to develop a whole host of techniques to significantly increase the effectiveness of look-ahead. We are pursuing this technique because the main challenge in improving performance is not a lack of ideas. Rather, most ideas are simple in concept but require extraordinary effort to ensure their correctness in the unlikely (but possible) scenarios that we call corner cases. The corner-case correctness requirement is absolute. After all, few real-world problems allow running a program a bit faster at the expense of possibly getting things wrong, and never knowing exactly when it does go wrong. The gist of our technique is simple: if we use the fast-but-possibly-wrong execution only as a source of hints to smooth things out in the real execution, we don't have to worry about corner cases.
On the other hand, if we have a semi-oracle custom-built for a particular program telling us what the future of the execution will look like, we get far more bang for the buck than by using big, expensive hardware to predict the future in a generic way, as the state of the art does. Throughout this project, we tried many different little and not-so-little tricks, accumulating good ideas and learning lessons from those that did not pan out as hoped. In the end, we have a prototype system that can automatically improve a program's execution significantly (up to several times faster). Moreover, we have accumulated a large number of ideas that we believe to be better guesses than what we tried in the project. No less important than the techniques we developed (all of which are documented and published in the computer architecture literature) is the training of PhD students. Two students' theses are on this topic, and both students now work on industry teams, applying the technical and methodological lessons learned from their research. These are future leaders who will help keep our industry the best in the world.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0747324
Program Officer: Hong Jiang
Project Start:
Project End:
Budget Start: 2008-06-01
Budget End: 2014-05-31
Support Year:
Fiscal Year: 2007
Total Cost: $393,350
Indirect Cost:
Name: University of Rochester
Department:
Type:
DUNS #:
City: Rochester
State: NY
Country: United States
Zip Code: 14627