Multi-program execution, the concurrent execution of multiple independent applications, will play an important role in efficiently exploiting the potential of future multi-core systems. Researchers use multi-program workloads to evaluate proposed designs and policies for various aspects of multi-program execution. Unfortunately, the fixed-workload and variable-workload methodologies used today are unsound and lead to incorrect results. The proposed research therefore investigates a new multi-program workload construction scheme, called FIESTA (Fixed Instruction with Equal STAndalone runtimes), in which application samples are chosen so that individual applications have equal runtimes when executing alone. The samples are then mixed and matched to form multi-program workloads, and the same samples are used in every experiment. The research will investigate two issues related to FIESTA: generation of application samples, and extension to multi-threaded environments. FIESTA workloads should produce results that are internally consistent and plausible.
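As a rough illustration of the construction rule (not the project's actual tooling), the Python sketch below sizes each application's sample so that its standalone runtime matches a common target, then mixes and matches the fixed samples into workloads. The application names, instruction rates, and helper functions are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical standalone instruction rates (instructions per second) for a
# few applications on a baseline machine; the numbers are illustrative only.
standalone_ips = {
    "bzip2": 1.9e9,
    "mcf":   0.6e9,
    "gcc":   1.4e9,
    "lbm":   0.9e9,
}

TARGET_RUNTIME_S = 1.0  # the equal standalone runtime chosen for every sample

def fiesta_sample_lengths(ips_table, target_runtime):
    """Size each application's sample (in instructions) so that, running
    alone, every sample takes roughly the same wall-clock time."""
    return {app: int(ips * target_runtime) for app, ips in ips_table.items()}

def fiesta_workloads(sample_lengths, n_programs=2):
    """Mix and match the fixed samples into multi-program workloads; the
    same samples are reused in every experiment."""
    return list(combinations(sorted(sample_lengths), n_programs))

if __name__ == "__main__":
    lengths = fiesta_sample_lengths(standalone_ips, TARGET_RUNTIME_S)
    for workload in fiesta_workloads(lengths):
        print(workload, [lengths[app] for app in workload])
```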

Computer architecture research surges when new tools, benchmarks, and methodologies are introduced and distributed. The quality and depth of single-program experimental evaluation improved when efficient sampling and simulation techniques like SimPoints were introduced. FIESTA should provide the same impetus for multi-program execution research and education, while becoming the standard methodology for sampling both single-threaded and multi-threaded programs.

Project Report

This project focuses on how to experimentally evaluate computer systems. When a system has multiple processor cores, it often runs what is known as a "multiprogrammed workload", in which each core runs a different software program. Because multiprogrammed workloads are typical use cases for systems, computer system designers seek to develop software benchmarks that are representative of how typical multiprogrammed workloads behave. The challenge is that, unlike with single-program benchmarks, where a single benchmark program simply runs on a single core, many new issues arise when creating multiprogrammed benchmarks. Consider even the simple case of two benchmark programs, A and B, where A runs longer than B. Do we consider the performance of the system once B has finished and A is still running? More complicated issues arise for multiprogrammed workloads in which a large number of programs are run; for example, if there are more programs than cores, how do we choose which program to assign to a newly idle core? Furthermore, all of these problems are exacerbated when we consider sampling of benchmark programs. There are vastly more variables to consider when constructing a multiprogrammed benchmark, and we must do so in a way that enables fair comparisons between different systems. The original goal of this project was to develop a methodology that overcomes what is known as "load imbalance," i.e., the situation in which benchmark programs run for different lengths of time. In exploring this issue, we discovered that the problem was far broader. Our discoveries led us to eventually determine that the more important problem to solve was how to precisely and unambiguously specify multiprogrammed benchmarks that enable fair comparisons across systems. We developed a methodology for specifying multiprogrammed benchmarks, and we further classified multiprogrammed benchmarks into classes based on the use model they correspond to (e.g., datacenter or desktop).
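To make the ambiguity in the A/B example concrete, the following Python sketch (with made-up instruction and cycle counts) shows how two plausible measurement conventions report different aggregate throughput for the same execution.

```python
# Made-up instruction and cycle counts for programs A and B running together;
# B finishes first while A keeps running (the load-imbalance case above).
A = {"instructions": 800e6, "cycles_to_finish": 1_000e6}
B = {"instructions": 300e6, "cycles_to_finish": 400e6}

def throughput_until_first_finish(a, b):
    """Convention 1: measure only while both programs are still running."""
    window = min(a["cycles_to_finish"], b["cycles_to_finish"])
    # Approximate each program as retiring instructions at a constant rate.
    insts = sum(p["instructions"] * min(window, p["cycles_to_finish"]) / p["cycles_to_finish"]
                for p in (a, b))
    return insts / window

def throughput_until_last_finish(a, b):
    """Convention 2: measure until every program is done; the early
    finisher's core simply sits idle for the remainder."""
    window = max(a["cycles_to_finish"], b["cycles_to_finish"])
    return (a["instructions"] + b["instructions"]) / window

print(throughput_until_first_finish(A, B))  # ~1.55 instructions per cycle
print(throughput_until_last_finish(A, B))   # 1.10 instructions per cycle
```

The two conventions report noticeably different numbers for the identical run, which is exactly the kind of ambiguity a precise benchmark specification must resolve.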

Project Start:
Project End:
Budget Start: 2012-01-01
Budget End: 2013-12-31
Support Year:
Fiscal Year: 2012
Total Cost: $134,472
Indirect Cost:
Name: Duke University
Department:
Type:
DUNS #:
City: Durham
State: NC
Country: United States
Zip Code: 27705