This research should further our understanding of how to build reliable distributed systems. The proposed checkpointing algorithm emphasizes practicality and should show that checkpointing is a practical way to provide reliability in a working, general-purpose, distributed system. The algorithm possesses a set of desirable properties that are crucial in such a system. Novel solutions are provided to satisfy the conflicting requirements of these properties. The algorithm also shows how checkpointing can be incorporated with other well-established concepts in distributed systems, such as atomicity and a modular architecture using shared memory for efficient intra-node communication and message-passing for inter-node communication.