Applications of Failure Detection

Halpern, Joseph; Toueg, Sam

Abstract

Unreliable failure detection has been gaining acceptance as a fundamental paradigm for fault-tolerant distributed computing. This paradigm was originally introduced to circumvent the impossibility of achieving consensus in asynchronous systems with crash failures. The proposed research aims to extend the applicability of failure detection in multiple ways: (1) It will seek solutions that tolerate both process crashes and link failures, and in particular, solutions that are resilient to network partitioning. (2) It will consider practical problems besides consensus, e.g., atomic commitment and various forms of group membership. (3) It will investigate the extension of failure detection to other models, such as the timed asynchronous model, in order to solve problems whose specifications involve real-time. (4) It will explore the use of randomization techniques to enhance the power of failure detection. This research intends to widen the scope of failure detection and to firmly establish it as a core component of fault-tolerant distributed systems.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 9711403
Program Officer: Mukesh Singhal

Budget Start: 1997-09-01
Budget End: 2001-08-31
Fiscal Year: 1997
Total Cost: $230,000

Institution: Cornell University, Ithaca, NY, United States
