The past decade has seen an explosion of interest in so-called "dynamic" or "scripting" programming languages, which emphasize programmer productivity at the possible expense of run-time performance. Among other applications, scripting languages are central to much web-based and mobile computing. With increasing use, and with the proliferation of multicore processors, there will be inevitable pressure to improve the performance of these languages through parallel execution. Unfortunately, the state of the art in parallel scripting has not kept pace with parallel architecture developments. While many scripting languages support asynchronous handling of events from the external world, programmer-centric features, like dynamic typing, make it very difficult for event handlers and the main program -- or multiple aspects of the main program -- to execute efficiently and simultaneously on separate cores of a modern processor.
The goal of this project is to build a detailed understanding of the tradeoffs between scripting language design and the performance of parallel execution. This goal is accomplished through two main tasks: (1) minimizing the overhead necessary to safeguard the language implementation in the face of parallel execution, and (2) quantifying the marginal cost of different models of data sharing and memory consistency. The bulk of the work takes place in the Ruby scripting language, widely used for Internet server development. This approach will leverage recent developments in transactional memory, read-mostly synchronization, and high-performance data structures.
This project will contribute directly to the training of students at both the graduate and undergraduate levels, and to curricula for both advanced and introductory courses. More broadly, effective support for parallelism in mainstream scripting languages can be expected to produce significant improvements in productivity across the full range of computer applications, in government, industry, science, the arts, and entertainment.
Modern computers solve problems quickly by arranging for multiple processors ("cores") to collaborate on the solution in parallel. Writing programs for these computers is hard, for many of the same reasons that it is hard to coordinate the activities of human beings collaborating in the physical world. The difficulty is exacerbated in computing by the fact that, until recently, programming languages for parallelism were designed for use only by experts, and only for the most complex and demanding problems. The emphasis has been on making the complex cases possible, not on making the simple cases easy.

Work under this grant has taken important steps to improve this situation. It began with the assertion that parallel programs will be easier to write and understand if their behavior is deterministic -- if output depends only on input, and not on the detailed interleaving of the actions of the cores. Drawing on experience from multiple research groups, we were able to formally characterize the meaning of "deterministic" -- to show that there are multiple possible definitions, and to explain the impact of those definitions on ease of programming, debugging, and performance. Building on this formal foundation, we developed a deterministic parallel dialect of the widely used Ruby programming language. Ruby is representative of the "scripting" family of languages, in which ease of programming is of paramount importance, and in which parallelism has historically been particularly challenging. Using our language and implementation, we demonstrated that the "simple case" in parallel programming -- in essence, the identification of "mutually independent" subcomputations -- can indeed be easy to express. Moreover, independence, which is needed for correctness, can be verified automatically and efficiently.
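To illustrate the "simple case" of mutually independent subcomputations, the following sketch -- written in plain Ruby with standard threads, not in DPR's actual syntax -- runs two tasks that read shared input but write to disjoint slices of the output. Because no element is written by more than one task, the result is the same under every possible interleaving, which is precisely the deterministic behavior described above:

```ruby
# Hypothetical sketch (plain Ruby threads, not DPR syntax): two mutually
# independent subcomputations. Each thread writes only to its own index
# range of `results`, so the final output is deterministic regardless of
# how the scheduler interleaves the threads.
data    = (1..8).to_a
results = Array.new(data.size)

halves  = [0...4, 4...8]           # disjoint index ranges
threads = halves.map do |range|
  Thread.new do
    range.each { |i| results[i] = data[i] * data[i] }  # no overlapping writes
  end
end
threads.each(&:join)

puts results.inspect  # => [1, 4, 9, 16, 25, 36, 49, 64]
```

In DPR, the independence of such tasks is not left to programmer discipline: it is checked automatically, so an accidental overlap in the written data would be detected rather than silently producing schedule-dependent output.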
Experience with our Deterministic Parallel Ruby (DPR) suggests that it can be a powerful tool not only for commercial development, but for the teaching of parallel programming as well.