The proposed work will develop, implement, and evaluate new techniques that help to automate performance analysis of modern software systems. The methodology pursued breaks down the problem of automating performance analysis into three components: identifying performance anomalies, detecting covariation between performance metrics, and determining causality between covariant metrics. The proposed approach uses statistical data mining and machine learning techniques to automate the three components. The proposed system works on a collection of traces, with each trace containing one or more streams of measurements from a performance metric.
The project will bring techniques from the statistical data mining and machine learning techniques to bear on the problem of automating performance analysis. The fundamental insight of the proposed approach is that there is significant information in the time-varying contours of a stream. Previous statistical approaches to performance analysis have ignored this information in favor of examining the covariation across metrics at a particular snapshot of time.