Software is an integral part of our everyday lives, and our economy relies heavily on software working correctly. However, software bugs cause security breaches and cost our economy billions of dollars annually. Although these costs are well known, the software industry struggles to remedy the situation: the inherent complexity of software makes bugs so common that new bugs are typically reported faster than developers can fix them. The goal of this project is to develop a technique that fixes bugs automatically, greatly reducing the cost of fixing bugs, improving software quality, and reducing the negative effects of bugs on the economy and society.
Because so much software has already been written, many subroutines, data structures, and algorithm implementations already exist in open-source software. For many software bugs, therefore, some other open-source code base already contains code that implements the correct behavior and can be substituted into the buggy system to fix the bug. This project investigates two key properties necessary to build such a bug-fixing technique. First, the project attempts to validate the assumption that correct code candidates actually exist in open-source code bases. Second, the project aims to demonstrate that semantic code search techniques can effectively find these code candidates, and that the gaps between the correct and incorrect versions can be bridged using automatic techniques. Altogether, this exploratory project is intended to establish the feasibility of automated bug fixing through semantic search of open-source software. The broader impact of this work is the advancement of techniques that improve software quality, which, in turn, reduces the negative economic and societal effects of software bugs. This grant supports exploratory work on an untested, but potentially transformative, research idea.
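As a rough illustration of the idea, the sketch below searches a small set of candidate functions for one whose behavior matches a handful of input/output examples that the buggy code fails. It is not the project's technique: the corpus, the function names, and the examples are all hypothetical, and candidates are simply executed against the examples, which stands in for whatever semantic search encoding the project actually uses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical specification: input/output examples describing the intended
# behavior (here, the integer midpoint of two integers).
Example = Tuple[Tuple[int, int], int]
EXAMPLES: List[Example] = [((0, 10), 5), ((2, 4), 3), ((7, 7), 7), ((-4, 4), 0)]


def buggy_midpoint(lo: int, hi: int) -> int:
    """Hypothetical buggy code: it forgets to add `lo` back in."""
    return (hi - lo) // 2


# Hypothetical stand-ins for functions mined from open-source code bases.
def avg_floor(a: int, b: int) -> int:
    return (a + b) // 2


def span(a: int, b: int) -> int:
    return b - a


def larger(a: int, b: int) -> int:
    return max(a, b)


CORPUS: Dict[str, Callable[[int, int], int]] = {
    "avg_floor": avg_floor,
    "span": span,
    "larger": larger,
}


def satisfies(fn: Callable[[int, int], int], examples: List[Example]) -> bool:
    """Return True if fn reproduces every example without raising."""
    for args, expected in examples:
        try:
            if fn(*args) != expected:
                return False
        except Exception:
            return False
    return True


def semantic_search(
    corpus: Dict[str, Callable[[int, int], int]], examples: List[Example]
) -> List[str]:
    """Return the names of corpus functions whose behavior matches the examples."""
    return sorted(name for name, fn in corpus.items() if satisfies(fn, examples))


if __name__ == "__main__":
    print("buggy code passes:", satisfies(buggy_midpoint, EXAMPLES))
    print("repair candidates:", semantic_search(CORPUS, EXAMPLES))
```

In this toy setting the candidate has the same signature as the buggy function, so it can be substituted directly. In practice a match may differ in argument order, types, or naming, which is the gap the abstract says must be bridged automatically.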