Technology scaling has enabled tremendous growth in the integrated circuit (IC) industry over the past few decades. While Moore's Law remains on track, the fine line widths of current and future nanoscale technologies (e.g., 45nm and below) present several obstacles that threaten to limit continued device scaling, curtail frequency improvements, and increase leakage power in future microprocessors. To make matters worse, voltage and temperature fluctuations arise from increasing power dissipation and from the very techniques that attempt to reduce power. The combined impact of these variations forces designers to incorporate ever-larger design margins to guarantee reliable operation. This proposal seeks to address variability cooperatively at both the circuit and architecture levels to ensure reliable operation of next-generation computing systems.
This proposal outlines work along three main research thrusts. The first thrust investigates circuit and architectural solutions to two growing concerns: the reliability of on-chip memories and the difficulty of obtaining consistent performance from every manufactured chip. Both concerns arise from variations introduced when manufacturing transistors in aggressively scaled technologies. The second thrust seeks to minimize the aforementioned margins in critical parts of a chip via flexible circuit and architecture configurations that can accommodate variations. The third thrust leverages the first two efforts to understand the impact of variability and to offer guidelines for variability-tolerant chip-multiprocessor designs.
The explosion and pervasiveness of modern consumer electronic devices can largely be attributed to several decades of consistent progress in integrated circuit (IC) technology. This progress took shape as the ability to incorporate increasing numbers of ever-shrinking, ever-faster transistors into a single computer "chip," a trend commonly referred to as technology scaling. These transistors are the building blocks that allow chips to perform complex computations and store vast amounts of information in a tiny form factor.

While the consistent shrinking of transistors is desirable, numerous challenges have emerged as devices scale down to dimensions measured in tens to hundreds of atoms. At these nanometer scales, how each transistor operates can fluctuate widely depending on the number and placement of individual atoms within the device. Because current fabrication technology cannot assemble individual atoms into exact replicas of the billions of transistors that go into a single chip, variation in transistor characteristics has grown as transistors have shrunk.

This project sought to address variability in state-of-the-art chips by understanding the nature of the variations, designing circuitry more immune to transistor variations, and exploring new computing structures (i.e., architectures), all with the broader objective of building faster computer chips that consume less energy. There are three notable outcomes from this project, each described below; a brief, illustrative code sketch of each idea follows the three descriptions.

Voltage interpolation is a novel way to connect transistors so that circuits can be tuned to operate at consistent performance despite the underlying variations described above. The technique allows clusters of transistors to connect to one of two power supplies (i.e., voltages). The higher voltage lets circuitry operate at higher speed at the expense of higher power consumption; the lower voltage leads to slower operation with lower power consumption. By configuring how intrinsically faster and slower clusters of transistors connect to the higher and lower voltages, a batch of chips can be tuned to operate at a higher average speed. Without this technique, chips would exhibit a wide range of speeds, and manufacturers would have to rate each chip at the lowest speed dictated by its worst-case cluster of transistors.

Thread motion is a high-level, computer-architecture solution to the variation problem for future processors containing multiple cores. It builds on the premise that the underlying variation will cause the cores within a chip to exhibit a range of top speeds. Instead of running all of the cores at the lowest common speed, thread motion moves program threads among the cores, which operate at different speeds, to improve the overall performance of all the programs. The technique further provides the appearance that all cores operate uniformly at a common, higher speed.

DeCoR is an architectural solution that addresses another form of variability in modern microprocessor chips. Based on a detailed model of how the power consumption of the internal circuitry interacts with the power delivery network within a chip, it implements a system of checking to make sure computation completes correctly, and it repeats computation in a reliable fashion when it does not. Detailed simulations show that this costly corrective action is infrequent and, hence, overall system performance can improve.
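To make voltage interpolation concrete, the sketch below shows one way the post-fabrication tuning step could be expressed in software. It is a simplified illustration under assumed numbers, not the project's actual implementation: the cluster delays, the target cycle time, the 1.3x speedup factor at the higher voltage, and the greedy assignment loop are all hypothetical stand-ins for the real circuit-level configuration process.

    # Hypothetical sketch of voltage-interpolation tuning: connect each
    # cluster of transistors to a low (VddL) or high (VddH) supply so that
    # the whole chip meets a target cycle time despite process variation.

    def tune_chip(cluster_delays_at_vddl, target_cycle_time, speedup_at_vddh=1.3):
        """Greedily connect the slowest clusters to the higher supply (VddH)
        until every cluster meets the target cycle time.

        cluster_delays_at_vddl: intrinsic delay of each cluster on the low
            supply (varies chip to chip because of process variation).
        speedup_at_vddh: assumed delay-reduction factor at the higher voltage.
        Returns a per-cluster voltage assignment, or None if untunable.
        """
        assignment = ["VddL"] * len(cluster_delays_at_vddl)
        delays = list(cluster_delays_at_vddl)
        while max(delays) > target_cycle_time:
            slowest = delays.index(max(delays))
            if assignment[slowest] == "VddH":
                return None  # even VddH cannot meet the target cycle time
            assignment[slowest] = "VddH"
            delays[slowest] /= speedup_at_vddh
        return assignment

    # Example: a chip whose clusters came out of fabrication at varied speeds.
    print(tune_chip([1.2, 0.9, 1.4, 1.0], target_cycle_time=1.1))
    # -> ['VddH', 'VddL', 'VddH', 'VddL']: only the intrinsically slow
    #    clusters pay for the higher-power supply, yet the chip as a whole
    #    meets the faster spec instead of the worst-case one.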
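Thread motion can likewise be illustrated with a small scheduling sketch. The round-robin rotation policy and core frequencies below are invented for simplicity; the point is only to show how migrating threads among fast and slow cores lets every thread progress at the average core speed rather than the worst-case speed.

    # Hypothetical sketch of thread motion: rotate program threads across
    # cores whose top speeds differ because of variation, so each thread
    # sees the *average* core speed rather than the slowest core's speed.

    def run_with_thread_motion(core_freqs_ghz, n_quanta):
        """Rotate threads round-robin across cores each scheduling quantum
        and report the effective per-thread frequency."""
        n = len(core_freqs_ghz)
        work = [0.0] * n  # cycles of progress accumulated per thread
        for q in range(n_quanta):
            for thread in range(n):
                core = (thread + q) % n  # every quantum, each thread moves
                work[thread] += core_freqs_ghz[core]
        return [w / n_quanta for w in work]

    # Four cores whose top speeds differ because of process variation.
    freqs = [3.0, 2.4, 2.8, 2.2]
    print(run_with_thread_motion(freqs, n_quanta=100))
    # -> every thread progresses at ~2.6 GHz on average, versus the 2.2 GHz
    #    worst-case speed a uniform chip-wide clock would be limited to.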
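Finally, a rough sketch of DeCoR's check-and-replay idea, under the assumption that the variability being guarded against is a droop in the supply voltage: results are held back until the voltage is known to have stayed above a safe margin, and any block of work that overlapped an "emergency" is replayed. The voltage trace, the 0.9 margin, and the block granularity are all made-up inputs for illustration.

    # Illustrative sketch of DeCoR-style checking: commit work only when the
    # supply voltage stayed above a safe margin during execution; otherwise
    # squash the speculative results and replay the affected block.

    def execute_with_decor(blocks, voltage_trace, safe_margin=0.9):
        """Run instruction blocks speculatively; replay any block whose
        execution window overlapped a voltage emergency."""
        committed, replays, t = [], 0, 0
        for block in blocks:
            while True:
                droop = voltage_trace[t % len(voltage_trace)] < safe_margin
                t += 1
                if droop:
                    replays += 1  # emergency: discard speculative state, retry
                else:
                    committed.append(block)  # clean window: safe to commit
                    break
        return committed, replays

    blocks = ["b0", "b1", "b2", "b3"]
    trace = [1.0, 0.95, 0.85, 1.0, 0.98, 1.0]  # one droop below the margin
    done, replays = execute_with_decor(blocks, trace)
    print(replays)  # -> 1: corrective action is rare in this toy trace, which
                    #    mirrors why thin margins can become safe in practice.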
The above outcomes demonstrate the value of collaborative computer-architecture and IC research, which has been gaining attention from researchers in recent years. IC technology has been maturing but continues to present new challenges, and it has become important for researchers in related but traditionally separate fields to join forces and find holistic solutions that can continue to improve computer performance. The students supported through this project have graduated and gone on to full-time positions at US computer and IC companies (IBM, Intel, and Nvidia). One of them has since transitioned to a full-time academic position.