Future integrated circuits will contain tens, hundreds, or even thousands of cores per chip. However, the technology downscaling that makes this possible may also make the underlying hardware less reliable, owing to an increasing number of defects and wear-out mechanisms. Reliability is therefore one of the major problems facing the design of multiprocessor systems-on-chip. Because either the cores or the network-on-chip (used for communication between the cores) can become the reliability bottleneck of these systems, it is imperative that reliability be addressed in a unified manner. To address this challenge, this research develops a novel unified theoretical framework for lifetime reliability modeling. The framework uses efficient Monte Carlo methods and treats multiprocessor systems-on-chip as a combination of computation and communication units. The goal of this research is to develop new dynamic reliability management techniques based on dynamic voltage and frequency scaling and application remapping. Drawing on control theory concepts, these techniques proactively improve the lifetime reliability of multicore systems.

The proposed dynamic reliability management techniques enable the development of more reliable multiprocessor systems-on-chip, which have a dramatic impact on society through applications ranging from entertainment and gaming to bio-engineering, military, and space. More broadly, the results of this project significantly impact the design of future integrated systems by advancing the understanding of the tradeoffs between reliability, as a new design concern, and power consumption, performance, and area, as traditional objectives.

Project Start:
Project End:
Budget Start: 2011-09-01
Budget End: 2012-10-31
Support Year:
Fiscal Year: 2011
Total Cost: $229,193
Indirect Cost:
Name: North Dakota State University Fargo
Department:
Type:
DUNS #:
City: Fargo
State: ND
Country: United States
Zip Code: 58108