The long-term goals of this project are to bring the technology of software contracts to widely-used programming languages and, through the use of manifest contracts, to provide software developers with a migration path from simply-typed code to fully functional correctness. Since computational effects are notoriously hard to reason about and pervade even the simplest realistic programs, the proposed research should have significant impact on programmers' ability to develop software that is more reliable and more secure.

Contracts in software establish clear interfaces between program components. Like contracts in the legal realm, they delineate each party's expectations and obligations. Such contracts are becoming increasingly important for the regulation of modern software systems, providing an expressive framework for verification and error tracking. To be effective in a software environment, contracts must have formal semantics and must be supported by a monitoring system that precisely tracks the flow of values as they cross interfaces. To date, however, the formal study of contracts has mostly been limited to small idealized languages without computational effects, such as reading data from or writing data to a display or file, managing resources such as memory, and performing probabilistic or speculative computation.

This research aims to extend the semantic framework of software contracts to languages with various computational effects: the extension is qualitative in nature and will enable the use of contracts in new application domains. Specifically, the PIs propose to add support for computational effects to the two flavors of contracts studied to date: latent contracts, which are runtime checks not reflected in the type system, and manifest contracts, where a system of precise types records the most recent runtime check applied to each value. The extension of latent contracts will be done in the context of a monadic meta language. The extension of manifest contracts will make use of a variant of Hoare Type Theory to precisely record computational effects. The PIs will also implement prototype systems and use them to present novel applications of software contracts.

Project Report

Just like its business counterpart, a software contract is a binding agreement between a producer and a consumer. Simple examples of contracts, technically known as "first-order contracts," have long been integral to various programming languages and libraries and routinely used by software developers for debugging and ultimately for quality assurance. Such contracts are characterized by the fact that it is computationally trivial to check and enforce the contract. For example, checking that a submitted billing address matches the official address on record is an operation that can efficiently implemented and enforced. As data has become "bigger" and as software has become more complex, writing enforceable contracts has become a scientific subject of its own. For example, checking that a submitted tax return is valid requires a tedious calculation involving a significant amount of communication and exchange of additional information. In fact, the computation involved in enforcing the contract becomes so significant that it itself might be subject to further constraints to ensure that its effect on the underlying computation is well-understood and regulated. These concerns technically translate to a formal analysis of contracts in the presence of higher-order functions, effectful operations, large data structures, lazy languages, and even software components written in different languages. As has become apparent in the past few years,in these situations, the behavior of contract checking becomes more complex. With higher-order functions, contracts must be delayed; with large data structures, contracts must trade-off the amount of checking with the time needed for traversing the data structures; with lazy languages, contracts must be adapted so as not to force excess evaluation of delayed computations; and with multi-language systems there are few formal results. A first major result of our project is the development of a foundational approach for expressing complex contracts as communication patterns in an ambient system of communicating processes. The approach generalizes and subsumes previous semantic frameworks and is readily amenable to a distributed implementation.This system allows each contract to specify its own evaluation strategy, enforcement policy, and interactions with the underlying computation. The second major technical result is to leverage (statically checked) contracts and multi-language systems to tackle a challenging open problem in compiler verification described next. A compiler translates code written in a high-level programming language to the low-level machine code that actually runs on a processor. Compilers play a critical role in the production of our software infrastructure and as such they should be correct. Though there has been remarkable progress on formally verified compilers in recent years, almost all of these compilers suffer from a serious limitation: they are proved correct under the assumption that they will only be used to compile *whole* programs. This is an unrealistic assumption since most software systems today are comprised of *components* written in different languages compiled by different compilers to a common low-level target language. A key challenge when compiling components is that it is not clear how to even state the compiler correctness theorem. Stating such theorems is easy when compiling whole programs: if the high-level program H compiles to a low-level program L, then running both H and L should result in the same trace of observable events. The same sort of theorem does not make sense when we compile a component: we cannot "run" a component because it is not a complete program. Our project has shown how to leverage contracts and multi-language systems to verify correct compilation of *components* while formally allowing these components to be linked with low-level code of arbitrary provenance. The essential intuition is that compiler correctness should guarantee that if we compile a high-level component H to the low-level component L and link L with some other low-level component L', then the resulting low-level program should correspond to the high-level component H linked with L'. But what does it mean to link a high-level component H with a low-level component L' and what are the rules for running the resulting high-low hybrid? We formalized the latter by designing a semantics of interoperability between the high-level and low-level languages of the compiler, leveraging contracts at the boundaries between languages. This high-low interoperability semantics makes it possible to state the compiler correctness theorem (for components). We gave the first verified multi-pass compiler that allows compiled components to be linked with low-level code compiled from a different language, one that may contain features not expressible in the language being compiled. Our work lays the foundation for compiler verification that scales to multi-language software. The approach also enables researchers to explore sensible language interoperability while tacking compiler correctness.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Type
Standard Grant (Standard)
Application #
1203008
Program Officer
Anindya Banerjee
Project Start
Project End
Budget Start
2011-08-01
Budget End
2014-07-31
Support Year
Fiscal Year
2012
Total Cost
$440,760
Indirect Cost
Name
Northeastern University
Department
Type
DUNS #
City
Boston
State
MA
Country
United States
Zip Code
02115