The most important task for scientific computing is to plan the way forward from the current era of multicore microprocessors implemented in deep-submicron CMOS. This must be done in a way that guarantees performance improvements; otherwise the escalating costs of fabrication at the nanometer scale will not be sustainable. Multicore microprocessors face several major problems, not the least of which is codified in Amdahl's Law. The law states that even if multicore processors reduce the run time of all parallelizable code to a vanishingly small value, the serial code remains, and even a relatively small amount of serial code limits the achievable gains. Consequently it is prudent to pursue a balanced technology approach in which both parallelizable and serial code benefit. For serial code, assuming all instruction-level parallelism (ILP) is fully exploited, the only way forward is to execute it on a higher clock rate unit (HCRU). Since CMOS clock rates have saturated due to wire scaling problems and excessive heat dissipation, one must look to an alternative three-terminal device that is compatible with CMOS. This project explores whether the solution lies with an overlooked device, the aggressively scaled SiGe Heterojunction Bipolar Transistor (HBT). The claim the research is to verify is that clock rates of 20-30 GHz are possible at reasonable power levels and densities, but that 3D technology is needed to mitigate memory wall problems.
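The Amdahl's Law argument above can be made concrete with a small calculation. The following is a minimal sketch, not part of the project itself: the function name `amdahl_speedup`, the 10% serial fraction, and the 6X serial boost (roughly 24 GHz against a 4 GHz baseline, treating clock ratio as speedup, which is an optimistic simplification) are all illustrative assumptions.

```python
def amdahl_speedup(serial_fraction, n_cores, serial_boost=1.0):
    """Overall speedup under Amdahl's Law.

    serial_fraction: share of the original run time that cannot be parallelized.
    n_cores:         number of cores sharing the parallelizable part.
    serial_boost:    hypothetical speedup of the serial part, e.g. from running
                     it on a higher clock rate unit (HCRU). 1.0 means no boost.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if n_cores < 1 or serial_boost <= 0:
        raise ValueError("n_cores must be >= 1 and serial_boost must be > 0")
    parallel_fraction = 1.0 - serial_fraction
    # Normalized run time after the change (original run time = 1.0).
    new_time = serial_fraction / serial_boost + parallel_fraction / n_cores
    return 1.0 / new_time


if __name__ == "__main__":
    SERIAL_FRACTION = 0.10   # assumed for illustration only
    HCRU_BOOST = 6.0         # assumed: ~24 GHz HCRU vs. 4 GHz CMOS core
    for cores in (4, 16, 64, 1024):
        plain = amdahl_speedup(SERIAL_FRACTION, cores)
        boosted = amdahl_speedup(SERIAL_FRACTION, cores, HCRU_BOOST)
        print(f"{cores:5d} cores: {plain:6.2f}x without HCRU, {boosted:6.2f}x with HCRU")
```

Under these assumptions, adding cores alone can never exceed a speedup of 1 / serial_fraction, whereas accelerating the serial part raises that ceiling, which is the motivation for the balanced approach described above.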

Project Report

Since 2005, clock rates for computers have stalled at roughly 4 GHz (5.5 GHz for the IBM z-196). Prior to 2005, clock rates improved by roughly 2X for every 33% lithographic shrink, which took place at the Moore's Law pace of roughly every two years. By 2013 there should therefore have been four 2X speedup cycles, and we would be at 64 GHz. Several things interrupted this pace, but innate device speed was not one of them: by 33 nm the n-channel device had reached an fT of 300 GHz. Instead two issues arose. The first was the increase in wire resistance due to the poor scaling properties of wire; the second was the power penalty of going faster, even if one could defeat the wire resistance problem. The first project sponsored under our EAGER program was to demonstrate that several basic building blocks of a computer could approach 32 GHz in operating speed, albeit with vertical SiGe HBT bipolar devices. We attained 27 GHz for a 32-bit adder and likewise 27 GHz for a 32-bit register file. In addition, the register file could be used as a Level Zero (L0) cache. The L0 contents could be transferred, a full cache line in parallel, to a 4 GHz BiCMOS SRAM serving as L1, another circuit that was verified. The initial program ran for 2 years. It has led to a renewal grant in which the impact of a High Clock Rate Unit (HCRU) could be quantified with real software benchmarks using a commercial simulator (SIMICS). In addition, the next generation of SiGe HBT, built in a new IBM process called 9HP, is being used to go even faster, possibly at lower power. The 9HP SiGe HBT exhibits a 300 GHz fT, but because of its exponential turn-on characteristic the device will drive wire better when more levels of wire are exploited in the wiring stack. This faster HCRU will address the challenges of Amdahl's Law.
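The 64 GHz figure quoted above follows from simple compounding, sketched below. The helper name `projected_clock_ghz` is hypothetical; the 4 GHz starting point, the 2005 and 2013 dates, and the two-year doubling cadence are taken from the report.

```python
def projected_clock_ghz(base_ghz, base_year, target_year, years_per_doubling=2.0):
    """Clock rate that would result if 2X-per-shrink scaling had continued."""
    doublings = (target_year - base_year) / years_per_doubling
    return base_ghz * 2.0 ** doublings


if __name__ == "__main__":
    # Four doublings between 2005 and 2013, starting from roughly 4 GHz.
    print(projected_clock_ghz(4.0, 2005, 2013), "GHz")
```

This is only the historical trend extrapolated forward; it says nothing about whether wire resistance or power would have allowed such rates.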

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1031440
Program Officer: Krishna Kant
Project Start:
Project End:
Budget Start: 2010-05-01
Budget End: 2012-12-31
Support Year:
Fiscal Year: 2010
Total Cost: $299,759
Indirect Cost:
Name: Rensselaer Polytechnic Institute
Department:
Type:
DUNS #:
City: Troy
State: NY
Country: United States
Zip Code: 12180