Yew

Sophisticated performance measurement and simulation tools, developed on the Cedar multiprocessor system during the last four years, are being used to study several key architectural and compiler issues that affect the performance of scalable shared-memory multiprocessors. These issues include strategies for reducing and hiding memory latency, data synchronization requirements for loop-level parallelism, and hierarchical network design. Studying them involves both hardware-assisted collection of empirical data on Cedar and simulation. The information thus obtained could lead to the design of next-generation systems that provide higher sustained performance than present-day systems across a broader range of applications.