Improvements in manufacturing processes for silicon integrated circuits, achieved by shrinking transistor and interconnect sizes, also give rise to integrated circuits that wear out if they are not designed and used properly. That is, they fail to keep working over a long period because of factors such as increased operating power and temperature and non-ideal scaling effects as feature sizes shrink. Left uncompensated, the number of failures is expected to grow exponentially. Thus, increasing wear-out failure rates will require design-for-lifetime-reliability methodologies for all systems, not just those with safety-critical missions.
A novel design-for-reliability methodology is being developed to aid in the design of cost-effective, lifetime-enhanced systems for embedded computing applications. The systems being designed are multiprocessors interconnected by an on-chip network, whose switches route information among the processors. The design methodology includes choosing the number of switches and their interconnection, choosing the types of processors and memories connected to the switches, selecting the processors on which the software programs will run, and selecting the voltage and clock rate at which each processor will operate. Each of these design decisions can have a major impact on the lifetime of an integrated circuit.
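To make the design decisions listed above concrete, the following is a minimal sketch of how one candidate design might be represented and how quickly the number of alternatives grows. The class name `DesignPoint`, its fields, and the helper `design_space_size` are hypothetical names chosen for illustration; they are not taken from the project, and the count assumes the decisions are independent of one another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One candidate design (hypothetical representation, for illustration)."""
    num_switches: int      # number of network switches
    links: frozenset       # switch-to-switch links, as (i, j) index pairs
    core_types: tuple      # processor type chosen for each processor slot
    task_to_core: tuple    # task_to_core[t] = processor that runs task t
    vf_levels: tuple       # voltage/clock setting index for each processor


def design_space_size(num_topologies, num_core_types, num_cores,
                      num_tasks, num_vf_levels):
    """Count design points, assuming every decision is made independently."""
    return (num_topologies
            * num_core_types ** num_cores   # a type for every processor
            * num_cores ** num_tasks        # a processor for every task
            * num_vf_levels ** num_cores)   # a voltage/clock level per processor


# A deliberately tiny example: one topology, 2 processor types, 2 processors,
# 3 tasks, and 2 voltage/clock levels.
small = design_space_size(num_topologies=1, num_core_types=2, num_cores=2,
                          num_tasks=3, num_vf_levels=2)
```

Because the count multiplies terms that are exponential in the number of processors and tasks, even modest systems produce far more candidates than can be evaluated one by one.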
The design methodology requires searching an enormous set of design alternatives, each of which requires a computationally expensive calculation of the mean time to failure of the system being designed. An ant-colony-based optimization technique is used to explore this large design space, and hardware accelerators, such as graphics processing units and field-programmable gate arrays, are used to speed up the mean-time-to-failure calculations.
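The search described above can be illustrated with a minimal ant colony sketch. It is not the project's implementation: it covers only the task-to-processor mapping decision, and `toy_mttf` is a hypothetical closed-form stand-in for the expensive mean-time-to-failure calculation. The stand-in assumes that each processor's failure rate grows exponentially with its load and that the system fails when its first processor fails. All names and parameter values are assumptions made for illustration.

```python
import math
import random


def toy_mttf(mapping, task_load, num_cores):
    """Toy stand-in for the expensive MTTF evaluation (an assumed model)."""
    load = [0.0] * num_cores
    for task, core in enumerate(mapping):
        load[core] += task_load[task]
    # Assumption: exponential lifetimes, failure rate rising with load, and a
    # series system, so the system MTTF is 1 / (sum of failure rates).
    rates = [1e-3 * math.exp(u) for u in load]
    return 1.0 / sum(rates)


def ant_colony_search(task_load, num_cores, ants=20, iterations=50,
                      evaporation=0.1, seed=0):
    """Search task-to-processor mappings for the highest toy MTTF."""
    rng = random.Random(seed)
    num_tasks = len(task_load)
    cores = range(num_cores)
    # pheromone[t][c]: learned desirability of running task t on processor c.
    pheromone = [[1.0] * num_cores for _ in range(num_tasks)]
    best_mapping, best_mttf = None, -math.inf

    for _ in range(iterations):
        for _ in range(ants):
            # Each ant builds a mapping, picking a processor for every task
            # with probability proportional to the pheromone level.
            mapping = tuple(rng.choices(cores, weights=pheromone[t])[0]
                            for t in range(num_tasks))
            score = toy_mttf(mapping, task_load, num_cores)
            if score > best_mttf:
                best_mapping, best_mttf = mapping, score
        # Evaporate all trails, then reinforce the best mapping found so far.
        for t in range(num_tasks):
            for c in cores:
                pheromone[t][c] *= 1.0 - evaporation
            pheromone[t][best_mapping[t]] += 1.0

    return best_mapping, best_mttf


# Four equal tasks on two processors: the toy model favors a balanced split.
best_mapping, best_mttf = ant_colony_search([1.0, 1.0, 1.0, 1.0], num_cores=2)
```

In this sketch the evaluation of each ant is independent of the others, which is the kind of structure that lends itself to the hardware acceleration mentioned above; here the evaluations simply run one after another.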
The use of embedded single-chip multiprocessing systems will continue to grow across a broad spectrum of applications, from personal computing devices to biological monitoring systems to consumer electronics. This continued growth in embedded computing is a major driver of the economy because it improves personal productivity. This research enables deeper scaling in the manufacturing process, and thus more capable systems, because these new integrated circuits will remain implementable even when individual components within them are expected to fail early or are not fully working when manufactured. The research results will be disseminated through graduate courses and research articles, with some topics introduced into a junior/senior-level course.