CI-P: Computer System Failure Data Repository to Enable Data-Driven Dependability Research

Bagchi, Saurabh; Song, Xiaohui Carol

Abstract

Dependability is a requirement for computer systems; however, research on dependable systems is hampered by a lack of real, publicly available failure data. This can result in productive paths of research being closed to most researchers and, conversely, unproductive research being performed due to faulty assumptions about the manner in which real systems fail. The goal of this project is to plan a collaborative effort to collect, curate, and provide public access to failure data for large-scale computer systems through a community repository. One challenge is that failure data is considered sensitive by its owners. The ultimate goal of this project is to collect the data from some of the NSF-funded large cyberinfrastructure projects, such as NEES, LIGO, XSEDE, and NRAO.

The specific goal of this planning project is to collect requirements from potential participants (both users and contributors of data sets) and to seed a prototype repository with data sets from two of the largest and latest clusters at Purdue. The data sets will comprise static information, dynamic information about the workloads, and failure information, for both planned and unplanned outages.

The broader impact of the project will be achieved through the dissemination of the data sets to a wide variety of researchers and perhaps even practitioners. The data sets will let people run their campus clusters more efficiently, i.e., with fewer failures and at higher utilization and energy efficiency.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1405906
Program Officer: Marilyn McClure

Budget Start: 2014-07-15
Budget End: 2016-06-30
Fiscal Year: 2014
Total Cost: $65,891

Institution: Purdue University, West Lafayette, IN, United States