This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

The project investigates statistical software analysis, which infers relationships among program components by using statistical properties derived from multiple program executions.

To motivate statistical techniques, it is useful to draw analogies to static analysis methods. Static analysis is about inferring dependencies between program components: If a value is changed in one component, how does that affect a value in a different component? Static analysis tends to work best for properties that are local, meaning the pieces of the program we are trying to relate are not separated by a great deal of other computation. The statistical analog of dependencies is correlation. Instead of proving definitively via static reasoning the presence or absence of dependencies, we can observe at run-time that some properties of two components have high or low correlation. Importantly, correlation is not affected by syntactic or even dynamic locality: if two components have a correlation, regardless of how much time or computation passes between the execution of one component and the execution of the other, this correlation can be detected if the appropriate statistical question is asked.

The initial focus is on using cross-correlation, which computes the maximum correlation between two sequences of observations over a range of time offsets, to formalize statistical correlation between software components that has a direction in time. This idea gives rise to a natural graph that captures the strength and direction of the statistical influence one component has upon another; these graphs are analogous to traditional dependency graphs, but have unique and useful properties.
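The cross-correlation idea described above can be illustrated with a minimal sketch. This is not the project's implementation: the function names `max_cross_correlation` and `influence_edges`, the `max_lag` and `threshold` parameters, and the choice of the Pearson coefficient as the correlation measure are all assumptions made for illustration. Each component is assumed to produce one numeric observation per time step.

```python
from itertools import combinations

import numpy as np


def max_cross_correlation(x, y, max_lag):
    """Return (corr, lag) with the largest |Pearson correlation| over lags
    in [-max_lag, max_lag]. A positive lag means y follows x by `lag` steps.
    (Hypothetical helper, not taken from the project.)"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D sequences of equal length")
    n = len(x)
    best_corr, best_lag = 0.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair x[t] with y[t + lag], keeping only the overlapping part.
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        # Correlation is undefined for very short or constant windows.
        if len(a) < 2 or a.std() == 0 or b.std() == 0:
            continue
        r = np.corrcoef(a, b)[0, 1]
        if abs(r) > abs(best_corr):
            best_corr, best_lag = r, lag
    return best_corr, best_lag


def influence_edges(series, max_lag, threshold):
    """Build directed edges (source, target, corr, lag) between named
    observation sequences. The edge points from the component that leads
    in time to the one that follows. Pairs whose best lag is 0 carry no
    direction and are skipped in this sketch (an assumption)."""
    edges = []
    for a, b in combinations(list(series), 2):
        corr, lag = max_cross_correlation(series[a], series[b], max_lag)
        if abs(corr) < threshold or lag == 0:
            continue
        if lag > 0:
            edges.append((a, b, corr, lag))
        else:
            edges.append((b, a, corr, -lag))
    return edges
```

Under these assumptions, calling `influence_edges({"parser": p, "cache": c}, max_lag=10, threshold=0.8)` on two recorded observation sequences `p` and `c` would yield an edge from whichever component's observations lead the other's, weighted by the correlation found at the best offset.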

Budget Start: 2009-08-01
Budget End: 2013-07-31
Fiscal Year: 2009
Total Cost: $499,999

Name: Stanford University
City: Palo Alto
State: CA
Country: United States
Zip Code: 94304