Building and evolving large software systems is one of the biggest challenges facing software engineers today. One approach to dealing with the complexity of a software system is to study its software architecture, the high-level organization of a software system. This research is developing lightweight techniques for modeling software architecture and for enforcing architecture in code.

The core approach is a new formal modeling framework for software architectures that can express arbitrary dynamism in the design while also providing abstraction and composition mechanisms that allow designs to scale. The framework is grounded in formal semantics that define how dynamic architectural models can be simulated and analyzed.

In order to increase the impact of software architecture in practice, the project is developing new techniques for mapping from arbitrary object-oriented implementation code to a high-level architectural design, using unobtrusive program annotations sprinkled throughout the source code. These annotations are used by novel program analysis techniques to verify that the code conforms to the structure of the architectural design, ensuring that engineers achieve the benefits of their architectural design in practice. The project is validating these techniques through case studies, and is achieving impact through open source software, commercialization, and educational outreach.

Project Report

Building and evolving large software systems is one of the biggest challenges facing software engineers today. One approach to dealing with the complexity of a software system is to study its software architecture, the high-level organization of the system (analogous to the architecture of a building, which represents its high-level structure). Software architecture has proven very useful for studying possible designs for software systems. Before this grant, however, it was very difficult to relate that design analysis to the source code of the system. As a result, the techniques of software architecture could not be leveraged as effectively in practice as in theory.

In this grant, we developed a new set of techniques for capturing the architectural design of a software system within its source code. The core technique was to put annotations representing architectural information into the source code. These annotations are based on the concept of "ownership": they identify the high-level objects in the system and the lower-level data structures that are part of each high-level object. A program then analyzes these annotations, checks them for consistency with each other and with the source code, and extracts an architectural design.

For example, the Secure Information Flow Case Study picture attached to this report shows an architecture extracted from source code using a tool developed as part of this research project. The architecture shows all possible paths of (explicit) information flow in the system. Checks and crosses show where the information flow does or does not match the designer’s intent. Software engineers can therefore look at this generated diagram and assess how closely their source code matches their intended design with respect to structure and information flow, which addresses the problem we set out to solve.
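
The annotate-check-extract idea described above can be illustrated with a small Java sketch. This is not the project's tool or its annotation language: the annotations `@Component` and `@Owned`, the example classes `Server` and `Cache`, and the `extract` method are all hypothetical names invented for illustration. The sketch also uses runtime reflection as a stand-in for the static program analysis the report describes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

public class OwnershipSketch {

    // Hypothetical annotation: marks a class as a high-level architectural object.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Component {}

    // Hypothetical annotation: marks a field as a lower-level part owned by its enclosing object.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Owned {}

    @Component
    static class Cache {
        @Owned List<String> entries = new ArrayList<>();
    }

    @Component
    static class Server {
        @Owned Cache cache = new Cache();
        @Owned List<String> log = new ArrayList<>();
        Server peer; // not owned: a plain reference to another high-level object
    }

    // Extracts an indented ownership hierarchy, starting from the given root class.
    static List<String> extract(Class<?> root) {
        List<String> lines = new ArrayList<>();
        walk(root, 0, new ArrayDeque<>(), lines);
        return lines;
    }

    private static void walk(Class<?> type, int depth, Deque<Class<?>> path, List<String> out) {
        // Consistency check: an object cannot (transitively) own an object of its own type.
        if (path.contains(type)) {
            throw new IllegalStateException("Cyclic ownership involving " + type.getSimpleName());
        }
        path.push(type);
        out.add("  ".repeat(depth) + type.getSimpleName());

        // Reflection does not guarantee field order, so sort by name for stable output.
        Field[] fields = type.getDeclaredFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        for (Field f : fields) {
            if (!f.isAnnotationPresent(Owned.class)) {
                continue; // only owned fields are part of the hierarchy
            }
            Class<?> fieldType = f.getType();
            if (fieldType.isAnnotationPresent(Component.class)) {
                walk(fieldType, depth + 1, path, out); // nested high-level object
            } else {
                out.add("  ".repeat(depth + 1) + f.getName() + " : " + fieldType.getSimpleName());
            }
        }
        path.pop();
    }

    public static void main(String[] args) {
        extract(Server.class).forEach(System.out::println);
    }
}
```

Running `main` prints `Server` at the top level, with `Cache` (and its owned `entries` list) and the owned `log` list nested beneath it. The unannotated `peer` field is left out of the hierarchy because it is a reference rather than an owned part.
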
The scientific contributions of this grant include the development of the techniques above, including their precise specification, mathematical theorems about their correctness, and an implementation and evaluation of the techniques on real software. The results of our empirical evaluations show that the techniques can apply to a wide range of software, have low costs, and have benefits for program understanding, for improving software designs, and for finding security problems in code. We expect the eventual impact of these results will be to help software engineers build software that is more reliable and secure, and to do so at lower cost. Ultimately, as these scientific contributions are refined and commercialized (something we continue to pursue), this research will benefit every user of software and positively impact the competitiveness of the United States software industry.

The impact of this work also extended to education in several respects. We released or maintained two open-source tools, SASyLF (www.sasylf.org) and Plural (http://code.google.com/p/pluralism/), which have been used in education in addition to research. Seven graduate students and 11 undergraduate students were trained to do research using the results of this grant. The PI actively sought out women and other underrepresented students for training in research, and was able to support 3 such students in this grant. In addition, 150 or more students have been impacted so far by the techniques and tools developed in this grant and integrated into the classroom; many more will likely see that impact in the future.

As a CAREER grant, this work also stimulated and supported new ideas in the area of reasoning about protocols in code (Plural), formal reasoning about programming languages (SASyLF), and other areas of programming languages and software engineering. The impact of these ideas may grow in the future through other work supported by the NSF and other entities.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0546550
Program Officer: Sol J. Greenspan
Project Start:
Project End:
Budget Start: 2006-02-01
Budget End: 2011-01-31
Support Year:
Fiscal Year: 2005
Total Cost: $490,000
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213