Improving software quality is a central priority of our time. Approaches to understanding, diagnosing, localizing, and fixing software defects are usually grounded empirically in datasets of past defects and repairs. A few such datasets are available; however, they are difficult to create, and typically lack the size and diversity needed to be representative of the software in use. The goal of this project is to create BugSwarm, a large-scale repository of replicable defects, tests, and patches. We propose to draw from the recorded history of defects in open-source projects to create a dataset of unprecedented size and diversity, while retaining sufficient fidelity of detail to allow careful study and replication of these historical defects. BugSwarm will amplify the size of available defect datasets by several orders of magnitude.
Continuous integration (CI) development practices, where build and test processes are carried out in the cloud with archived results, offer a novel opportunity to construct defect datasets. We propose to exploit CI practices to create BugSwarm. Continuous, online integration inherently creates archived records of build and test attempts, including those that result in build and/or test failures, and of the subsequent repairs. These practices emerged from two imperatives: a) to rigorously and automatically build, integrate, and fully test numerous code submissions from volunteer developers; and b) to test these submissions at large volumes, in virtualized, configurable, cloud-based settings. Virtualized, cloud-based testing makes these defects much more available and replicable. We will exploit these archived records to create DRPs (Defect Replication Packages) and DFRPs (Defect & Fix Replication Packages), comprising buggy versions of the code, failing regression tests, and bug fixes. DRPs and DFRPs will include complete virtual machines to reproduce real test failures. BugSwarm will facilitate experimentation and avoid the duplication of a tremendous amount of work among researchers in programming languages and software engineering.
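As a rough illustration of how archived CI records could yield candidate defect and fix pairs, the sketch below scans a build history for a failed build that is immediately followed by a passing build on the same branch. It is a minimal sketch under stated assumptions, not the project's actual pipeline: the Build record, its fields, and the fail_pass_pairs function are hypothetical names introduced here, and a real CI archive would need further checks (for example, that the failure stems from a test rather than the environment).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Build:
    """Hypothetical record of one archived CI build (fields are assumptions)."""
    branch: str
    commit: str
    number: int   # monotonically increasing build number
    passed: bool  # True if build and tests succeeded


def fail_pass_pairs(builds: Iterable[Build]) -> List[Tuple[Build, Build]]:
    """Return (failed, fixed) pairs of consecutive builds on the same branch."""
    last_by_branch: Dict[str, Build] = {}
    pairs: List[Tuple[Build, Build]] = []
    # Process builds in chronological order, tracking the latest per branch.
    for build in sorted(builds, key=lambda b: b.number):
        prev = last_by_branch.get(build.branch)
        if prev is not None and not prev.passed and build.passed:
            # Candidate defect (prev) and its candidate fix (build).
            pairs.append((prev, build))
        last_by_branch[build.branch] = build
    return pairs


# Toy history: only the "main" branch contains a failure followed by a pass.
history = [
    Build("main", "a1", 1, True),
    Build("main", "b2", 2, False),
    Build("dev", "c3", 3, False),
    Build("main", "d4", 4, True),
    Build("dev", "e5", 5, False),
]
pairs = fail_pass_pairs(history)
for failed, fixed in pairs:
    print(f"defect at {failed.commit}, candidate fix at {fixed.commit}")
```

Each pair found this way is only a candidate: the failing commit corresponds to the buggy version and failing tests of a DRP, and adding the passing commit gives the material for a DFRP, once the failure has been reproduced in a virtualized environment.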