Technology scaling makes ultra deep submicron (UDSM) circuitry increasingly susceptible to uncertainties originating from noise (reduced noise margins) and soft errors (induced by atmospheric neutrons), so future logic designs will need built-in error detection and correction capability for reliable computation. In the past, data encoding techniques have mostly been used for error detection and correction in wired and wireless communication channels. To enable reliable computing in future nanoscale technologies, not only the data but also the computation performed on the data will need to be encoded for real-time error detection and correction (i.e., redundant computations will need to be performed). Error rates of combinational logic in scaled technologies are expected to escalate by 9 orders of magnitude from 1992 to 2011, at which point they will equal the error rate of "unprotected" memory elements. Currently, coding techniques are used to design reliable memory banks and enable reliable memory access, but their use in on-chip signal processing has been limited. One of the key barriers to widespread use of coding techniques for reliable on-chip computing is the cost of the data and circuit redundancy needed to implement a coding technique that can detect and correct single and multiple bit-errors in logic. While error detection is accomplished relatively easily by the majority of known algorithm-based and communication-system coding techniques, error correction is a harder problem and can require significant computation for exact correction. This makes real-time correction without significant loss of throughput difficult, if not impossible, to achieve. The difficulty is especially acute for the majority of DSP applications that involve matrix-vector multiplications, which are the core subject of this proposal.
In this context, it is important to point out that in future scaled technologies with high error rates, rapid error correction with the least impact on throughput will be a critical enabling factor. Without this capability, technology scaling itself may grind to a halt due to gross loss of circuit- and system-level performance. This research focuses primarily on coding and probabilistic correction techniques for on-chip linear and non-linear digital signal processing computations. These techniques allow near-exact correction to be performed with minimal impact on circuit performance and power consumption in systems composed of unreliable components, which generate intermittent errors on their output lines at much higher rates than exact error correction techniques can handle without significant loss of performance. The increased error rate is assumed to be driven by very aggressive technology scaling. The errors are due to reduced noise margins in UDSM circuitry, power/ground bounce, and radiation-induced effects, or to permanent failures (stuck-at-0/stuck-at-1) on internal signal lines that are excited intermittently by real-time stimulus. In addition, the proposed techniques can be applied to special classes of analog circuits (filters, amplifiers, etc.). It is expected that the proposed research will open up new avenues for addressing the challenges of performing reliable computation with unreliable hardware (high error rates), a problem that is expected to dominate the field of reliable computing in the coming decade.
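
As a rough illustration of the algorithm-based error detection mentioned above for matrix-vector multiplication, the following sketch appends a single column-sum checksum to the computation. It is not the coding scheme proposed in this project; the function names, the integer test data, and the `fault` parameter used to mimic a soft error are all assumptions made for the example.

```python
def checksum_row(A):
    """Column sums of A: the redundant 'checksum' row for the matrix."""
    return [sum(col) for col in zip(*A)]


def matvec(A, x):
    """Plain matrix-vector product y = A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def checked_matvec(A, x, fault=None):
    """Compute y = A x and a redundant checksum; return (y, error_detected).

    `fault` is a hypothetical (index, delta) pair that corrupts one output
    element to mimic a soft error in the datapath.
    """
    y = matvec(A, x)
    if fault is not None:
        i, delta = fault
        y[i] += delta
    # Redundant computation: (column sums of A) . x must equal sum(y).
    expected = sum(c * v for c, v in zip(checksum_row(A), x))
    return y, expected != sum(y)


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = [1, 0, -1]
y_ok, err_ok = checked_matvec(A, x)                   # fault-free run
y_bad, err_bad = checked_matvec(A, x, fault=(1, 4))   # one corrupted output
```

With integer data the check is exact; a floating-point version would need a tolerance. Note that this single checksum only signals that some output is wrong. It does not identify which element is corrupted, which is consistent with the point above that correction is the harder problem.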

Budget Start: 2006-10-01
Budget End: 2010-09-30
Fiscal Year: 2006
Total Cost: $218,044
Name: Georgia Tech Research Corporation
City: Atlanta
State: GA
Country: United States
Zip Code: 30332