This Small Business Innovation Research Phase I project develops a low-cost, software-oriented, distributed fault-tolerance mechanism. The approach combines process monitoring, fault detection, checkpointing, and recovery. The mechanism is implemented on a distributed system to capitalize on the fault tolerance inherent in such systems. Communication mechanisms, process migration, and distributed monitoring methods are developed. The project proposes an object-oriented software methodology for developing fault-tolerant applications and a software developer's toolkit to facilitate programming. It also investigates an open process communication architecture to enhance the interoperability of the system, and develops a high-level command language with which users can specify system parameters and control system actions. The system will handle both hardware and software faults that are not handled at the hardware and operating system levels. A prototype will be developed to demonstrate the feasibility of the concepts.
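As a rough illustration of how process monitoring, fault detection, checkpointing, and recovery might fit together, the following minimal single-machine sketch may help. It is not the project's design: the names `Monitor`, `CheckpointStore`, and `run_demo`, the heartbeat-timeout detection rule, and the in-memory snapshot store are all assumptions made for the example, and a real distributed system would replicate checkpoints across nodes and restart or migrate the failed process.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical monitor: records heartbeats and flags silent processes."""
    timeout: float  # seconds without a heartbeat before a fault is declared
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, pid: str, now: float) -> None:
        # Record the time at which the process last reported in.
        self.last_seen[pid] = now

    def failed(self, now: float) -> list:
        # A process is considered faulty if it has been silent longer than the timeout.
        return sorted(p for p, t in self.last_seen.items() if now - t > self.timeout)


class CheckpointStore:
    """Hypothetical store: keeps the latest serialized state per process (in memory)."""

    def __init__(self) -> None:
        self._snapshots = {}

    def save(self, pid: str, state: dict) -> None:
        # Serializing makes the snapshot independent of later changes to `state`.
        self._snapshots[pid] = pickle.dumps(state)

    def restore(self, pid: str) -> dict:
        return pickle.loads(self._snapshots[pid])


def run_demo() -> dict:
    monitor = Monitor(timeout=2.0)
    store = CheckpointStore()

    # The worker makes progress, sends a heartbeat, and checkpoints at t = 0, 1, 2.
    state = {"step": 0}
    for now in (0.0, 1.0, 2.0):
        state["step"] += 1
        monitor.heartbeat("worker-1", now)
        store.save("worker-1", state)

    # One more step is done but never checkpointed; then the worker goes silent.
    state["step"] += 1

    # At t = 5 the monitor detects the fault and the last checkpoint is restored.
    failed = monitor.failed(now=5.0)
    return {pid: store.restore(pid) for pid in failed}
```

In this sketch, recovery rolls the worker back to its last saved state, so work done after the final checkpoint is lost; the checkpoint interval therefore trades runtime overhead against the amount of recomputation after a fault.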