Shrinking fabrication feature sizes and the proliferation of mixed-signal and 3D integration are elevating rates of faults, variation, and aging-related degradation in electronic integrated circuits, threatening the reliability of communication between processing cores in system-on-chip (SoC) architectures. The goal of this project is to realize system-level computer-aided design (CAD) automation techniques and tools that help chip designers trade off reliability against competing design constraints for on-chip interconnection networks under tight time-to-market pressure. The resulting framework will exploit cross-layer insights about the software application, hardware intellectual property blocks, and circuits, as well as knowledge of the key factors that affect the susceptibility of network routers and interfaces to runtime faults. By achieving reliability goals and multi-objective design trade-offs for on-chip interconnection network fabrics with orders of magnitude lower time complexity and overhead than is possible today, this project will transform the design of the multi-core SoCs that already permeate most facets of our daily lives.
The research will drive a tightly integrated education plan to inspire K-12 students toward STEM careers, ensure workforce continuity, and increase participation of veterans, undergraduates, and women via capstone projects and distance education initiatives. A new course on fault-tolerant chip design will be created, and existing courses on computer architecture and embedded systems will be enhanced with reliability-centric components. By exposing graduate students to diverse aspects of CAD algorithms, SoC architectures, and parallel applications, the educational component of this project will contribute to an agile high-tech workforce that sustains US leadership in technological innovation.