Software quality has been an important but elusive concept for several decades, with experts offering different, sometimes conflicting, guidelines. Using an ecosystem of about 20,000 open source Java projects, this study will try to discover correlations between open source component utilization and software quality metrics, in order to provide the strongest empirical evidence yet as to how the various metrics pertaining to software quality correlate with actual utilization of reusable components. This study will provide a scientific basis for some of the existing guidelines and for the dismissal of others. If no correlations are found, this result will disrupt current conceptualizations of component quality, forcing researchers and developers to reassess their understanding of software quality and reusable components. Because software is a foundation of modern society, and Open Source development is a significant movement in society at large, it is critical to gain a deeper understanding of software quality on a global scale, leading to the development of innovative tools and methods.
The metrics used in this study are those defined by the SQO-OSS Quality Model. To understand correlations, the following method will be used. First, a dependency graph will be built, capturing software dependencies at the global scale. This requires overcoming technical challenges in cleaning up and clustering the data, as real-world projects contain all sorts of idiosyncrasies related to the use of external components. Second, using this global dependency graph, a suite of utilization metrics will be developed that captures the depth and breadth of component usage by these projects. Finally, the SQO-OSS Quality metrics for a significant subset of projects in the data set will be computed and compared with the projects' utilization metrics in order to reveal the correlations.
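
The three steps above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy dependency edges, and the quality scores are all hypothetical, the two utilization metrics (direct and transitive dependents) are just one plausible reading of "breadth" and "depth", and Spearman rank correlation is assumed here as the comparison because the text does not name a specific statistic.

```python
from collections import defaultdict


def build_dependents(edges):
    """Step 1: invert (project, component) edges into component -> direct users."""
    dependents = defaultdict(set)
    for project, component in edges:
        dependents[component].add(project)
    return dependents


def direct_usage(dependents, component):
    """Step 2a (assumed 'breadth' metric): number of direct dependents."""
    return len(dependents.get(component, ()))


def transitive_usage(dependents, component):
    """Step 2b (assumed 'depth' metric): number of direct and indirect dependents."""
    seen = set()
    stack = [component]
    while stack:
        node = stack.pop()
        for user in dependents.get(node, ()):
            if user not in seen:
                seen.add(user)
                stack.append(user)
    seen.discard(component)  # guard against dependency cycles
    return len(seen)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Step 3: Spearman rank correlation, i.e. Pearson on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical toy data: (project, component it depends on).
EDGES = [
    ("app1", "libA"), ("app2", "libA"), ("app3", "libB"),
    ("libA", "libC"), ("app3", "libC"),
]
# Hypothetical quality scores; placeholders, not SQO-OSS output.
QUALITY = {"libA": 0.7, "libB": 0.4, "libC": 0.9}

if __name__ == "__main__":
    graph = build_dependents(EDGES)
    components = sorted(QUALITY)
    usage = [transitive_usage(graph, c) for c in components]
    quality = [QUALITY[c] for c in components]
    print("rank correlation:", spearman(usage, quality))
```

A rank-based statistic is used in this sketch because usage counts in dependency graphs tend to be heavily skewed; any other correlation measure could be substituted in `spearman` without changing the graph and metric steps.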