This research involves the experimental analysis of delayed consistency in large-scale cache-based multiprocessors, including its effects on false sharing and latency tolerance. False sharing refers to the read/write sharing of cache blocks in the absence of data sharing in a parallel computation. Latency tolerance refers to the reduction of the average latency of shared-memory accesses by overlapping them with computation. Delayed consistency takes advantage of weak ordering in cache-based systems. The sending of cache invalidations can be delayed, in which case the invalidations can be overlapped with processor accesses to the cache. Additionally, both the sending and the receiving of invalidations can be delayed, in which case false sharing effects are reduced by increasing the time during which cached blocks remain accessible to each local processor. The quantitative effects of delayed consistency and its variants will be evaluated through execution-driven simulation of parallel benchmark programs, including nine parallel numerical and non-numerical algorithms and thirteen Fortran programs from the Perfect Club Benchmark suite, parallelized using an Alliant 2800 compiler. Emphasis will be on the effects of block size, cache size, and granularity of parallelism. Statistics will be collected not only on miss rates, but also on memory traffic, memory access latencies, and total execution times.
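The interaction between false sharing and delayed invalidations described above can be illustrated with a toy model. The sketch below is not the project's simulator. It assumes infinite caches, a write-invalidate protocol, and a hypothetical trace format of `("w", processor, address)` events plus `("sync",)` markers. The function name `count_misses` and its parameters are invented for illustration. With receive-side delay, an invalidation is only applied at the next synchronization point, which is the behavior weak ordering permits in this simplified setting.

```python
def count_misses(trace, block_size, delayed=False, num_procs=2):
    """Count cache misses for a trace under a toy write-invalidate model.

    Assumptions (illustrative only): infinite caches, word addresses,
    and invalidations that are either applied at once or, when
    delayed=True, buffered until the next ("sync",) event.
    """
    valid = [set() for _ in range(num_procs)]    # blocks each cache holds
    pending = [set() for _ in range(num_procs)]  # buffered invalidations
    misses = 0
    for event in trace:
        if event[0] == "sync":
            # Synchronization point: apply all buffered invalidations.
            for p in range(num_procs):
                valid[p] -= pending[p]
                pending[p].clear()
            continue
        op, proc, addr = event
        block = addr // block_size
        if block not in valid[proc]:
            misses += 1
            valid[proc].add(block)
            pending[proc].discard(block)  # a fresh copy was just fetched
        if op == "w":
            # A write invalidates every other cached copy of the block.
            for q in range(num_procs):
                if q != proc and block in valid[q]:
                    if delayed:
                        pending[q].add(block)
                    else:
                        valid[q].discard(block)
    return misses


# Two processors write different words (addresses 0 and 1), so no data
# is actually shared; any sharing of a block is false sharing.
trace = []
for _ in range(8):
    trace.append(("w", 0, 0))
    trace.append(("w", 1, 1))
trace.append(("sync",))

print(count_misses(trace, block_size=4))                # both words in one block
print(count_misses(trace, block_size=4, delayed=True))  # invalidations deferred
print(count_misses(trace, block_size=1))                # one word per block
```

In this toy trace, the four-word block makes every write miss under immediate invalidation, while delaying invalidations until the synchronization point leaves only the two cold misses, the same count as with one-word blocks. Real protocols also have to handle ownership, finite caches, and true sharing, none of which this sketch models.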

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 9115725
Program Officer: Yechezkel Zalcstein
Project Start:
Project End:
Budget Start: 1992-07-15
Budget End: 1994-06-30
Support Year:
Fiscal Year: 1991
Total Cost: $136,545
Indirect Cost:
Name: University of Southern California
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089