The ability to run programs deterministically, so that re-execution always yields identical results, is useful for many purposes: e.g., replay debugging, intrusion analysis, fault tolerance, byzantine accountability, and timing channel control. Running parallel programs deterministically is traditionally difficult and costly, however, especially if we wish to guarantee precise repeatability even of arbitrarily buggy or malicious software.

Determinator is a novel operating system that enforces determinism on multithreaded and multi-process computations, parallelized both across cores in one machine and across nodes in a cluster. The kernel provides only single-threaded, ``shared-nothing'' address spaces, interacting via synchronization primitives that enforce deterministic behavior on all user-level code. Nondeterministic inputs and observable notions of time - including clocks, timers, cycle counters, and timing-dependent internal communication channels - are accessible only via controlled I/O mechanisms, giving supervisory software precise control over how and when nondeterministic information may affect a supervised computation. Atop this constrained kernel API, an untrusted runtime uses distributed computing techniques to emulate familiar abstractions such as Unix processes, file systems, and shared memory multithreading.

By building and evaluating this experimental OS architecture, we hope to discover: (1) whether OS-enforced deterministic execution can be made practical and performance-competitive with conventional OS environments, even for massively parallel applications; (2) how to emulate conventional nondeterministic APIs and run legacy software deterministically with few modifications; and (3) how to create new, "naturally determinisic" parallel programming APIs, offering powerful but easy-to-use abstractions for expressing parallelism, while guaranteeing predictable and precisely repeatable results that are independent of execution scheduling.

Project Report

The Determinator project at Yale built an experimental multiprocessor, distributed operating system that creates an environment in which anything an application computes is exactly repeatable. Determinator consists of a microkernel and a set of user-space runtime libraries and applications. The microkernel provides a minimal API and execution environment, supporting a hierarchy of "shared-nothing" address spaces that can execute in parallel, but enforcing the guarantee that these spaces evolve and interact deterministically. Atop this minimal environment, Determinator's user-space runtime library uses distributed systems techniques to emulate familiar shared-state abstractions such as Unix processes, global file systems, and shared memory multithreading. A subset of Determinator comprises PIOS ("Parallel Instructional Operating System"), a teaching OS derived from and providing a course framework similar to JOS, where students fill in missing pieces of a reference skeleton. Determinator/PIOS represents a complete redesign and rewrite of the core components of JOS. To our knowledge PIOS is the first instructional OS to include and emphasize increasingly important parallel/multicore and distributed OS programming practices in an undergraduate-level OS course. It was used to teach CS422: Operating Systems at Yale in Spring 2010, and is freely available for use and adaptation by others. Determinator also provide the basis for the design of CertiK a CertiKOS, a certified OS kernel project in collaboration with the FLINT research group. CertiKOS aims to develop a small but highly modular operating system that has a machine-verifiable proof of its correctness and security properties. Finally, ideas explored in the Determinator project proved instrumental in GPUfs, a collaboration between UT Austin, Yale, and Technion to build a high-performance file system abstraction and API usable directly from GPU code to access file systems maintained on a conventional host operating system. In particular, GPUfs adapted many of the weak consistency models that Determinator explored and developed for different purposes. The Determinator project yielded two top-tier conference papers including a Best Paper Award at OSDI 2010, a journal paper in ACM Transactions on Computing Systems (TOCS), one completed and one forthcoming PhD thesis, and a number of workshop papers and technical reports. Further information on Determinator, PIOS, and related projects, as well as a complete project bibliography, may be found on the Determinator project home page.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Type
Standard Grant (Standard)
Application #
1017206
Program Officer
M. Mimi McClure
Project Start
Project End
Budget Start
2010-08-01
Budget End
2014-07-31
Support Year
Fiscal Year
2010
Total Cost
$472,130
Indirect Cost
Name
Yale University
Department
Type
DUNS #
City
New Haven
State
CT
Country
United States
Zip Code
06520