Successful use of computing technologies has enhanced quality of life. In fact, it is hard to point to even one aspect of societal functioning that has not been positively affected by computer systems. Tremendous advances in computer system performance, achieved while simultaneously lowering the cost of computing, have been the primary reason for this success. However, shrinking device sizes have recently led to new challenges in system reliability. System reliability cannot be compromised and is the primary focus of this proposed research. Rather than addressing a single reliability concern, such as soft errors or process variations, this proposal takes a multi-dimensional approach to improving Mean-Time-To-Failure (MTTF). Because reliability degrades extremely slowly over time, the solutions proposed in this research are designed to be low cost. To develop low-cost solutions, the first step is to non-intrusively monitor the health of a processor to understand its aging process before taking proactive measures to detect errors. When an error is detected, instead of employing expensive hardware solutions, this proposal uses low-cost, flexible software mechanisms to correct it. While any one approach will certainly extend MTTF, the true benefits of the proposed research will come to fruition when error monitoring, detection, and correction are employed in a hierarchical framework based on reliability needs.

The broader impacts of this proposal are on two fronts. The proposed research is motivated by industrial concerns regarding system reliability. On the technology front, the low-cost solutions developed will be transferred for industry adoption through close industry-academia interactions. Most of the proposed research ideas will be designed and implemented by the research team as research prototypes. These prototypes will be shared with industrial partners for further evaluation in an industrial setting. On the second front, recruitment of women and minority students will be one of the key driving forces for encouraging broader participation in the proposed research. This objective will be achieved through active participation and involvement of USC's Family of Schools in Los Angeles.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Application #
0954211
Program Officer
Hong Jiang
Project Start
Project End
Budget Start
2010-08-01
Budget End
2015-07-31
Support Year
Fiscal Year
2009
Total Cost
$331,308
Indirect Cost
Name
University of Southern California
Department
Type
DUNS #
City
Los Angeles
State
CA
Country
United States
Zip Code
90089