This project is building tools and developing methods that identify the root cause of software configuration problems and suggest potential corrective actions. Our work is motivated by the increasing complexity of modern software, which makes computer systems difficult to configure and manage correctly. Users and administrators currently spend considerable time and effort troubleshooting software configuration problems. For instance, technical support is estimated to account for 17% of the total cost of ownership for desktop computers and 60-80% for information systems.
We are demonstrating how system support for causality tracking can substantially reduce the time and human effort needed to troubleshoot software. We are focusing on configuration problems, in which the application code is correct but the software has been installed, configured, or updated incorrectly, so that it does not behave as desired. We are developing methods and tools that automate troubleshooting, thereby reducing both the time to recover from errors and the manual effort required. Our tools track causality within software binaries by using dynamic instrumentation to monitor information flow at byte granularity. They propagate this information among files, processes, and multiple computers, which allows them to troubleshoot complex distributed systems. Multi-level causality tracking helps determine the set of configuration values and other inputs that are most likely to have influenced the control flow of misconfigured programs. We expect that the tools developed during this project will make complex computer systems easier to manage; this has the potential to dramatically reduce administrative support costs for our nation's computer infrastructure.
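To make the idea of byte-granularity causality tracking concrete, the following is a minimal Python sketch, not the project's tooling. The project's tools instrument software binaries dynamically; this toy model only mimics the bookkeeping. Every name in it (`Tainted`, `source`, `Tracker`, the `httpd.conf` labels, and the `/tmp/state` path) is a hypothetical chosen for illustration. Each byte carries the set of inputs it was derived from, the labels survive a write and a later read of a simulated file, and every branch records which inputs influenced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A byte string plus, for every byte, the set of inputs it derives from."""
    data: bytes
    labels: tuple  # one frozenset of input names per byte

    def __post_init__(self):
        assert len(self.data) == len(self.labels)

    def __getitem__(self, s: slice) -> "Tainted":
        # Only slices are supported in this sketch; labels are sliced in step.
        return Tainted(self.data[s], self.labels[s])

    def __add__(self, other: "Tainted") -> "Tainted":
        # Concatenation keeps each byte's own label set.
        return Tainted(self.data + other.data, self.labels + other.labels)


def source(name: str, data: bytes) -> Tainted:
    """Label every byte of an input (e.g. a configuration value) with its origin."""
    return Tainted(data, tuple(frozenset({name}) for _ in data))


class Tracker:
    """Toy stand-in for system support: simulated files and a branch log."""

    def __init__(self):
        self.files = {}       # path -> Tainted (labels survive across "processes")
        self.influences = {}  # input name -> number of branches it influenced

    def write(self, path: str, value: Tainted) -> None:
        self.files[path] = value

    def read(self, path: str) -> Tainted:
        return self.files[path]

    def branch(self, value: Tainted, expected: bytes) -> bool:
        # Record every input that any byte of the branch condition depends on.
        for label in frozenset().union(*value.labels):
            self.influences[label] = self.influences.get(label, 0) + 1
        return value.data == expected

    def suspects(self) -> list:
        # Inputs ranked by how many control-flow decisions they influenced.
        return sorted(self.influences, key=lambda k: (-self.influences[k], k))


tracker = Tracker()
port = source("httpd.conf:Port", b"8080")
root = source("httpd.conf:DocumentRoot", b"/srv/www")

tracker.write("/tmp/state", port + root)  # one process writes the values
loaded = tracker.read("/tmp/state")       # another process reads them back

ok = tracker.branch(loaded[:4], b"80")    # branch that depends on the port bytes
tracker.branch(loaded[:4], b"443")
tracker.branch(loaded[4:], b"/srv/www")   # branch that depends on the root bytes
suspects = tracker.suspects()
```

In this toy run the port value influences two branches and the document root one, so the port is ranked first among the candidate causes. Counting influenced branches is only one plausible ranking heuristic, assumed here for illustration.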