Electronic systems are an indispensable part of everyday life. Malfunctions in these systems have consequences that range from annoying computer crashes and loss of data and services to financial and productivity losses, and even loss of human life. For coming generations of electronic systems manufactured using advanced silicon technologies, several hardware failure mechanisms that were largely benign in the past are becoming visible at the system level. As a result, the vast majority of future electronic systems, handhelds and servers alike, will require resilience to hardware failures during operation. Traditional redundancy techniques impose significant costs and are impractical for general use. This proposal addresses the outstanding challenge: how can systematic approaches be created to achieve the required level of resilience at minimal power, performance, and area costs?
In addition to educating new generations of students and semiconductor industry professionals on emerging resilience topics, this project will enable the broader research community to answer several key questions for which satisfactory answers do not exist today. Quantitative answers to these questions will enable the industry to create next generations of electronic systems with highly cost-effective resilience. As a side benefit, the results of this project can help overcome challenges in post-silicon validation and debug of complex future electronic systems.