This proposal describes an approach to multi-resolution monitoring of system behavior at runtime, and to generating models from the monitored data, in dedicated distributed and parallel systems. The proposed research focuses on dedicated systems, particularly clusters that perform a well-known set of tasks occurring with relatively predictable frequency. Anomalies can be detected by comparing the observed runtime behavior against a high-fidelity model of the system. The following three sub-areas will be emphasized:
(1) Multi-resolution Behavior Monitoring
(2) Data Fusion
(3) Flexible Behavior Modeling
The intellectual merit of this work lies in furthering the understanding of anomaly detection among cooperating processors working over high-speed networks, while keeping the resulting overhead acceptable. The proposal also includes a continuing industrial collaboration, the completion and initiation of PhD dissertations, and the involvement of historically underrepresented groups in a summer research program.