CSR---PDOS: Online Production-Run Software Failure Diagnosis at the User Site

Zhou, Yuanyuan

Abstract

As software systems have grown in size, complexity, and cost, it has become increasingly difficult to deliver bug-free software to end users, which results in many software failures during production runs at the user site. While much work has been done on software failure diagnosis, most of it focuses on off-site diagnosis (i.e., diagnosis at the development site with the involvement of programmers) and is therefore insufficient for diagnosing production-run software failures at the user site.

To effectively address production-run failures, we propose a novel approach that automatically performs on-site software failure diagnosis at the moment of a software failure and provides programmers with a detailed diagnosis report on the failure, including bug type, bug location, likely root cause, fault propagation chain, failure-triggering input, failure-triggering execution environment, potential temporary fixes, etc., without violating users' privacy concerns or imposing large overhead during normal execution. To achieve this ambitious goal, the proposed research tightly integrates innovations from multiple layers: (1) Low-overhead operating and run-time system support to capture the failure moment without imposing large overhead during normal execution. (2) A novel, extensible, customizable, human-like failure diagnosis protocol. (3) Novel program analysis techniques that are specifically designed for on-site failure diagnosis. (4) Use of existing and emerging hardware support and simple hardware extensions to reduce overhead. (5) A library-based API to allow applications to control or customize the diagnosis process if necessary.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 1022830
Program Officer: Krishna Kant

Project Start
Project End
Budget Start: 2009-10-01
Budget End: 2012-08-31
Support Year
Fiscal Year: 2010
Total Cost: $569,486
Indirect Cost

Institution

University of California San Diego, La Jolla, CA, United States
