Software developers have difficulty navigating the code base during software maintenance and therefore waste a significant amount of time. A major challenge for software engineering is thus to understand how developers seek relevant code and to build tools that facilitate code navigation. Previous research has produced some ad hoc tools without any theoretical basis, along with some descriptive theories derived by observing developer behavior. These approaches have yielded only modest improvements. A more comprehensive and effective solution can only be reached through a theoretical understanding of the code navigation task itself. This research seeks a unified theory for code navigation based on mathematically modeled foraging mechanisms that evolved to help our animal ancestors find food. These same mechanisms appear to work as users seek useful information in the vastness of the Web. Developers, like organisms foraging for food, need strategies that maximize the useful information gained for their maintenance tasks per unit cost. The research will explore the usefulness of this analogy and adopt theories, models and tools that could eventually help lower the cost and effort of software development and maintenance.

This EAGER proposal focuses on an exploration of the extent to which the code foraging environment can be automatically enriched by software clustering. The hypothesis guiding this research is that the way programming artifacts are grouped (clustered) can affect the profitability of the information foraging environment, which in turn can shape the way developers navigate the code base. In order to test the hypothesis, the PI will (i) develop an experimental framework to assess topical locality in software, (ii) create a new set of metrics to guide the evaluation of code rearrangement and foraging tools, and (iii) compare well-established source code clustering algorithms to discover effective enrichment mechanisms. The research will be evaluated through experiments with large open-source projects that have recorded detailed developer interaction logs.
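As an illustration only, one way to picture an assessment of topical locality is to compare the textual similarity of structurally linked code artifacts against that of unlinked ones. The sketch below is not the project's experimental framework: the bag-of-words cosine measure, the function name `topical_locality`, and the toy class names and tokens are all assumptions made for this example. If topical locality holds, linked pairs should score higher on average than unlinked pairs.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def topical_locality(tokens, links):
    """Return (mean similarity of linked pairs, mean similarity of unlinked pairs).

    tokens: dict mapping an artifact name to a list of its identifier/comment terms.
    links:  iterable of (name, name) pairs that are structurally related
            (e.g. one class calls or references the other); treated as undirected.
    """
    bags = {name: Counter(words) for name, words in tokens.items()}
    linked = {frozenset(pair) for pair in links}
    linked_sims, unlinked_sims = [], []
    for a, b in combinations(sorted(bags), 2):
        sim = cosine(bags[a], bags[b])
        if frozenset((a, b)) in linked:
            linked_sims.append(sim)
        else:
            unlinked_sims.append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(linked_sims), mean(unlinked_sims)


# Hypothetical toy data: three classes and one structural link.
tokens = {
    "Cart": ["cart", "item", "price", "total"],
    "CartItem": ["item", "price", "quantity"],
    "Logger": ["log", "file", "level"],
}
links = [("Cart", "CartItem")]

linked_mean, unlinked_mean = topical_locality(tokens, links)
print(f"linked: {linked_mean:.3f}  unlinked: {unlinked_mean:.3f}")
```

On real projects the terms would come from parsed source files and the links from a call or dependency graph; both extraction steps are outside the scope of this sketch.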

Project Report

The major goals of the project are threefold: (1) test the extent to which the tenets of optimal foraging theory hold in developers' code navigation, (2) test the extent to which clustering enriches the software environment, and (3) discover principled ways to enrich the software environment to improve developers' code navigation. The project is interdisciplinary: it synthesizes principles of evolutionary biology and ecological psychology and applies them to software engineering. The intellectual merit of the project lies in the formulation of the novel hypothesis that clustering can help to best shape the information foraging environment for software developers, and in the systematic studies carried out to test that hypothesis. The main results relating to the major goals are: (1) the basic foraging-theory tenet of 'topical locality' holds in object-oriented software, (2) the cluster hypothesis holds in the context of automated tracing, namely, correct and incorrect traceability links can be grouped and recognized in high-quality and low-quality clusters respectively, and (3) explicit semantic relatedness methods can be used to enrich the software environment for developers' code comprehension and navigation in a principled way. The broader impacts of the project are achieved through the publication and dissemination of peer-reviewed papers (most notably an ICSE'13 paper), the engagement of a female undergraduate student in software engineering research, the experience gained in developing interdisciplinary research tasks (e.g., human and social aspects in software engineering) to attract students from underrepresented groups, and the training of future software engineering researchers to carry out high-quality work. Specifically, the graduate student (Anas Mahmoud) supported by the EAGER grant not only successfully defended his Ph.D. dissertation in March 2014, but also received and accepted an offer to join Louisiana State University as a tenure-track Assistant Professor starting in August 2014.

Project Start:
Project End:
Budget Start: 2012-05-01
Budget End: 2014-04-30
Support Year:
Fiscal Year: 2012
Total Cost: $80,000
Indirect Cost:
Name: Mississippi State University
Department:
Type:
DUNS #:
City: Mississippi State
State: MS
Country: United States
Zip Code: 39762