Recording and deterministically replaying execution gives system designers the ability to travel backward in time. In computer systems, time travel means the ability to recreate arbitrary past states and events. In general, this is achieved by logging key events while the software runs, then restoring a previous checkpoint and replaying the recorded log to force the software down the same execution path deterministically. This alluring mechanism has enabled a wide range of applications in modern systems, including debugging programs, performing post-hoc security analysis, and improving fault tolerance. To maximize effectiveness, replay systems should (i) record at production-run speeds, (ii) keep logging requirements small, and (iii) replay at a speed similar to that of the initial execution. Software-only systems for deterministic replay achieve these goals on uniprocessor systems but suffer from poor performance on multiprocessor systems. Hardware-only systems record and replay efficiently on multiprocessor systems, but current proposals are largely impractical because they place too much functionality in hardware and because they cannot mix recording, replaying, and traditional execution on the same system concurrently. This research will focus on the design and implementation of Capo, a hybrid hardware/software replay system that will record and replay execution on multiprocessor systems efficiently. The key contribution of this research will be designing and implementing the first hardware/software interface for combining hardware-level and software-level replay systems. This interface will serve as the foundation for a new generation of replay systems that will achieve both the flexibility of software-level replay systems and the efficiency of hardware-level replay systems.
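The log-then-replay mechanism described above can be illustrated with a minimal, single-threaded sketch. It is not Capo's interface and does not address multiprocessor recording; the names `Recorder`, `Replayer`, `nondet`, and `program` are hypothetical and chosen only for illustration. The sketch assumes that every nondeterministic input (here a random number and a clock reading) passes through one logging call, so that restoring the checkpoint and feeding the log back reproduces the original run.

```python
import copy
import random
import time


class Recorder:
    """Runs the program live and logs every nondeterministic value it sees."""

    def __init__(self):
        self.log = []

    def nondet(self, kind, produce):
        value = produce()               # perform the real nondeterministic action
        self.log.append((kind, value))  # log it as a key event
        return value


class Replayer:
    """Feeds logged values back instead of performing the real action."""

    def __init__(self, log):
        self._entries = iter(log)

    def nondet(self, kind, produce):
        logged_kind, value = next(self._entries)
        if logged_kind != kind:
            # The replay took a different path than the recording did.
            raise RuntimeError(f"replay diverged: expected {logged_kind}, got {kind}")
        return value


def program(env, state):
    """Toy workload whose result depends on nondeterministic inputs."""
    for _ in range(3):
        state["total"] += env.nondet("rand", lambda: random.randint(1, 100))
    state["stamp"] = env.nondet("clock", time.time)
    return state


# Checkpoint: the state from which both the original run and the replay start.
checkpoint = {"total": 10}

# Record: run live from the checkpoint while logging nondeterministic events.
recorder = Recorder()
original = program(recorder, copy.deepcopy(checkpoint))

# Replay: restore the checkpoint and drive the same code from the log.
replayed = program(Replayer(recorder.log), copy.deepcopy(checkpoint))

assert replayed == original
```

In this sketch the log holds only the four nondeterministic values rather than every intermediate state, which is the sense in which logging requirements can be kept small. On a multiprocessor, the interleaving of shared-memory accesses would also have to be captured, and that is the part this sketch leaves out.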
If successful, this research will have substantial impact on industry by enabling the effective use of deterministic replay for parallel codes. The team will release all software artifacts as open source, which will help researchers and educators at many institutions.
In addition to their research contributions, the team will enhance a series of courses and expand their course offerings in parallel systems, especially at the undergraduate level. The PIs have a long-standing commitment to undergraduate education, routinely involving undergraduates in their research and exposing them to parallel software and hardware. The PIs will continue to involve undergraduate and graduate students in their research groups. This project is compelling to undergraduates because it involves the interaction of two different system layers.
Being able to deterministically record and replay a program gives computer system designers the ability to recreate that program's past. This time-traveling mechanism has a wide range of applications, including making computer systems more robust and reliable and making computer programs more secure. This project defined the fundamental principles behind applying time travel to computer systems using a combination of novel hardware, novel software, and the first hardware/software interface designed specifically to facilitate record and replay. These principles were applied to a wide range of software, from low-level operating systems to high-level web-based applications. The artifacts developed as part of this research had a major impact on industry through technology transfer. First, this research resulted in a hardware prototype for record and replay, built in collaboration with Intel. Intel is considering using this new hardware and hardware/software interface in its processor products. Second, the technology developed under this grant led directly to the formation of a startup, Adrenaline Mobility, which was created by two of the participants in this grant. Specifically, the technologies for recording and replaying web-based applications and for building secure operating systems were used directly by Adrenaline Mobility.