Reports estimate that regression testing, the activity of retesting a software system after it has been modified, can consume up to 50% of the cost of software development and maintenance. Although there are many techniques that can reduce the cost of regression testing, most of them do not account for important characteristics of modern systems, such as product lines, web applications, service-oriented architectures, and cloud-based applications. These systems are increasingly heterogeneous: their parts may come from different sources, may be written in different languages, and may be accessible in different formats (e.g., source code, binary code, or through remote interfaces). Moreover, modern software is often environment dependent: its behavior can be affected not only by changes in the code, but also by changes in its complex environment (e.g., databases, configuration files, and network layouts). Because most existing regression-testing techniques do not account for these characteristics, applying them can result in inadequately tested software, problems during maintenance, and ultimately poor software quality.
The overall goal of this research is to go beyond the state of the art in regression testing by defining novel approaches that can be applied to modern, real-world software and account for its characteristics and complexity. To achieve this goal, the research will first extend the analysis techniques on which regression-testing approaches rely, such as system modeling, version differencing, coverage analysis, and impact analysis. The research will then leverage these fundamental techniques to develop, evaluate with industrial partners, and make available a family of regression-testing techniques and tools that can (1) build comprehensive models of heterogeneous, environment-dependent software systems, (2) evolve these models throughout the systems' lifetimes, and (3) analyze the changes across models to understand their effects on the systems' behavior and retest the systems effectively and efficiently. The impact of the research will be manifold. First, the rigorous, transformative, and highly automated techniques developed will help improve the quality of today's large, complex software systems, thus benefiting all segments of society that depend on software. Second, the release of the produced tools and infrastructure will let other researchers and practitioners build on the results of this research, advancing knowledge and understanding. Finally, the research findings will be integrated into curriculum materials that will be made available to the broader scientific community, which will help prepare a globally competitive workforce and further benefit society.