Running experiments to measure a system's performance and comparing it against other works is a routine part of computer science research. However, experimenters have a natural tendency to choose experimental settings that favor their own work, which may introduce unfairness into the comparison. Instead of trying to eliminate such bias, which is difficult because human effort is unavoidable, this project aims to expose the bias to readers by visualizing the experimental results, so that a reader can understand the experimenter's true intent.

To achieve this goal, this project observes that an experimenter usually introduces bias by tuning the bottleneck of the experiment. The project therefore aims to develop comprehensive, automatic, and efficient approaches to identify experimental bottlenecks, along with concise and informative approaches to present those bottlenecks to readers. Concretely, it proposes to use a wait-for graph and an interpretive cumulative distribution function (CDF) to identify and present bottlenecks related to throughput and latency, respectively, the two most widely used performance metrics. It will further investigate how to scale these mechanisms to large and long-running systems.
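To make the two ideas concrete, the sketch below illustrates, under simplified assumptions, how a wait-for graph could flag a throughput bottleneck and how an empirical latency CDF could be computed. The trace records, thread and resource names, and latency values are hypothetical placeholders; a real implementation would derive them from instrumentation of the system under test, and the project's actual mechanisms may differ.

```python
from collections import defaultdict

# Hypothetical trace records: (waiting thread, resource waited on, seconds waited).
# In practice these would come from instrumenting the system under test.
trace = [
    ("worker-1", "disk", 0.42),
    ("worker-2", "disk", 0.38),
    ("worker-1", "lock-A", 0.05),
    ("worker-3", "network", 0.11),
]

# Wait-for graph: each edge (thread -> resource) is weighted by accumulated wait time.
wait_for = defaultdict(float)
for thread, resource, seconds in trace:
    wait_for[(thread, resource)] += seconds

# A throughput-bottleneck candidate is the resource with the largest total
# inbound wait time across all threads.
total_wait = defaultdict(float)
for (_, resource), seconds in wait_for.items():
    total_wait[resource] += seconds
bottleneck = max(total_wait, key=total_wait.get)
print(f"bottleneck candidate: {bottleneck} ({total_wait[bottleneck]:.2f}s waited)")

# Latency-side view: an empirical CDF over per-request latencies. Annotating which
# component dominates each percentile range is what would make the CDF "interpretive".
latencies = sorted([0.8, 1.1, 1.3, 2.5, 9.7])  # hypothetical request latencies (ms)
for i, x in enumerate(latencies, start=1):
    print(f"P(latency <= {x:.1f} ms) = {i / len(latencies):.2f}")
```

In this simplified view, the resource that accumulates the most wait time ("disk" in the toy trace) limits throughput, while the tail of the CDF points at the requests whose latency an experimenter might have tuned around.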

If successful, this project has the potential for significant real-world impact: first, performance analysis is a routine procedure in both industry and academia; second, clarifying experimenter bias would allow the community to evaluate research works more fairly. The project will also provide materials to educate students in a variety of programs, including undergraduate and graduate operating systems courses in the Department of Computer Science and Engineering, new projects in the Department of Analytics, and new projects at the Ohio Hackathon (targeting undergraduate and high school students).

All versions of data, including program source code, papers, trace data, and other documentation, will be maintained in a data repository at the Ohio State University using a versioning infrastructure, with periodic backups. Updates to the data will be published at http://web.cse.ohio-state.edu/~yangwang/experimenter_bias.html. All data will be stored and maintained for at least five years after the completion of the project.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1908020
Program Officer: Matt Mutka
Budget Start: 2019-07-01
Budget End: 2022-06-30
Fiscal Year: 2019
Total Cost: $496,893
Name: Ohio State University
City: Columbus
State: OH
Country: United States
Zip Code: 43210