An important class of practical parallel computers consists of machines comprising many large-grained processors. These processors execute asynchronously, and memory is assumed to be distributed. Executing a general n-thread parallel program P on such a parallel system poses theoretical and practical challenges. One must provide mechanisms for implementing the synchronization barriers, for load sharing and balancing amongst the processors, and for program execution that respects memory locality. The approach used in this research is to define algorithms and a program transformation (compilation) T so that, for any such program P, T(P) executes correctly and efficiently on the parallel system. This simulated execution does not require any specialized hardware or operating system mechanisms in the target machine. Of particular interest is the case in which the parallel program is large-grained, i.e., in each parallel step a substantial block of instructions is executed on some or most of the parallel threads. It is assumed that the parallel program is already optimized so as to make the blocks of code within each parallel step quite large. This project studies the most general case of large-grained parallel programs, paying particular attention to memory locality and to memory fault-tolerance, employing randomization. A further goal is to achieve Las Vegas style results, i.e., small expected running time and no errors.
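The simulation idea sketched above can be illustrated with a toy model: randomly hash each thread's block of work to one of p processors (a hypothetical load-sharing scheme, not the project's actual algorithm), have every processor drain its queue, and treat the end of the step as an implicit synchronization barrier. All names here are illustrative assumptions.

```python
import random
from collections import defaultdict

def simulate_step(blocks, num_procs, rng):
    """Simulate one large-grained parallel step.

    blocks    -- one callable per thread (the thread's block of code)
    num_procs -- number of asynchronous processors being simulated
    rng       -- seeded random.Random, used to hash threads to processors

    Returns a dict mapping thread id -> that thread's result.  The
    return point plays the role of the synchronization barrier: no
    result is visible until every processor has finished its queue.
    """
    # Randomized load sharing: each thread lands on a random processor.
    assignment = defaultdict(list)
    for tid, block in enumerate(blocks):
        assignment[rng.randrange(num_procs)].append((tid, block))

    results = {}
    for proc in range(num_procs):
        for tid, block in assignment.get(proc, []):
            results[tid] = block(tid)  # processor executes its share
    return results  # implicit barrier: the parallel step is complete

# Usage: 8 threads, each squaring its thread id, on 3 simulated processors.
rng = random.Random(0)
out = simulate_step([lambda t: t * t] * 8, num_procs=3, rng=rng)
```

Because the assignment is randomized, the expected load per processor is n/p blocks, which is the intuition behind the "small expected running time" (Las Vegas) goal: the answer is always correct, and only the running time is a random variable.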

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 9313775
Program Officer: Dana May Latch
Budget Start: 1994-05-01
Budget End: 1997-04-30
Fiscal Year: 1993
Total Cost: $320,174
Name: Harvard University
City: Cambridge
State: MA
Country: United States
Zip Code: 02138