During software maintenance, 50% to 90% of developer effort goes to program comprehension, the activities developers perform to better understand source code. Reducing this effort can reduce software maintenance costs. Researchers have developed techniques and tools to detect code clones (similar or identical segments of source code) because their presence can diminish program comprehensibility. However, knowing only that clones are present does not allow a developer to perform maintenance tasks correctly and completely; these tasks require a thorough understanding of the relationships among the detected clones. Existing approaches for investigating these relationships are limited in their applicability and effectiveness.
The goal of this collaborative project is to develop an automated and rigorous analysis process for identifying and codifying the relationships among clones using their structural and semantic properties. The investigators will perform a domain analysis to maximize the impact of the resulting techniques and tools on how effectively and efficiently maintenance tasks are performed when clones are present. After initial development, the team will validate and refine the techniques and tools. The research will help developers maintain software, reducing total software cost and improving overall software quality.