Software maintenance and evolution is a vital and resource-consuming phase of the software lifecycle. Introducing software changes is particularly complex in long-lived, large-scale, and globally distributed systems. Years of research have identified three core tasks that support developers during software maintenance: feature location (finding the starting point of a change in source code), impact analysis (finding the other software entities likely to be affected by the change), and expert developer recommendation (finding the appropriate developers to implement the change). The research will develop a novel one-stop solution for these tasks by integrating and mining the latent information scattered across the structured and unstructured software artifacts that are produced and constantly changed as software systems evolve. This information is largely untapped in current solutions.
This research program has three main goals: 1) define a new integrated framework, SE2, for a comprehensive analysis of software evolution that brings conceptual and evolutionary information under a single umbrella; 2) define new methodologies for software maintenance tasks based on SE2; and 3) perform empirical studies to evaluate SE2 and the methodologies it supports. Central to the solution are state-of-the-art data mining, information retrieval, and program analysis methods. The research will both formulate theoretical foundations and deliver novel practical solutions to uniformly represent, analyze, and use conceptual and evolutionary information within the SE2 framework. The broader impacts of the project include the production of software tools under open source licenses and collaboration with industry to transfer technology.
Software maintenance and evolution is a vital, time- and resource-consuming phase of the software product lifecycle. Effectively supporting key change-management tasks is necessary to sustain high-quality evolution of large-scale software systems. To address this issue, the project formulated a novel theoretical framework and created associated software tools that analyze, integrate, and use the conceptual and evolutionary information embedded in software artifacts. The empirical validation studies conducted show that the developed solution improves support for key maintenance and evolution tasks. The contributions of this investigation are a step towards answering the overarching research question: what are the exclusive and potentially synergistic benefits of integrating conceptual and evolutionary information with regard to key software maintenance tasks? The resulting work has been published in several high-quality software engineering conferences and journals (some receiving best result recognitions).
A number of undergraduate and graduate students, including two minority doctoral students and two non-traditional students, were trained and became contributing members of this project. Several of these students co-authored and presented papers at international conferences. Multiple graduate-level theses were derived from this project. The students graduating from this program have secured full-time employment in academia and the software industry. The scientific knowledge gained was integrated into multiple undergraduate and graduate classes at the two host institutions, which broadens STEM education. The ACM/IEEE Software Engineering Curriculum Guidelines identified software evolution among the ten key areas of SE education. A number of open-source software tools were developed and made publicly available.
The data repositories resulting from this project are accessible to the scientific community and the general public through the web sites of the two participating institutions. The project enhanced and strengthened a long-term professional collaboration, not only between the two PIs at their academic institutions but also among the students involved. The computing infrastructure established during the project allows its resources to be sustained.