The goal of this research is to reduce the impact of the widening speed gap between processors and memory. The project focuses on the design and evaluation of profiling techniques and hardware assists that improve the utilization and performance of the memory hierarchy. The planned efforts are to (1) improve the efficiency of on-chip caches through profile-guided code and data placement; (2) determine when it is advantageous to bypass the data cache; (3) exploit the page and burst modes of modern DRAMs; (4) investigate prefetching methods that reduce data access latency; and (5) design hardware assists at the main memory interface for compaction or partial transfer of data. These techniques will be evaluated using trace-driven simulations or profiling of binary executables for SPEC benchmarks and commercial applications.
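To illustrate what a trace-driven evaluation involves, the following is a minimal sketch and not the simulator used in this project. It replays a list of byte addresses through a toy direct-mapped cache and reports the miss ratio. The class name `DirectMappedCache`, the function `simulate`, the cache geometry (64 lines of 32 bytes), and the example trace are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache model; the sizes below are illustrative assumptions."""
    num_lines: int = 64      # number of cache lines (assumed)
    line_bytes: int = 32     # bytes per cache line (assumed)
    tags: dict = field(default_factory=dict)  # line index -> tag currently stored
    hits: int = 0
    misses: int = 0

    def access(self, address: int) -> bool:
        """Look up one byte address; return True on a hit, False on a miss."""
        block = address // self.line_bytes   # memory block holding the address
        index = block % self.num_lines       # cache line the block maps to
        tag = block // self.num_lines        # distinguishes blocks sharing a line
        if self.tags.get(index) == tag:
            self.hits += 1
            return True
        # Miss: the new block replaces whatever occupied this line.
        self.tags[index] = tag
        self.misses += 1
        return False


def simulate(trace, cache=None):
    """Replay an iterable of byte addresses and return the miss ratio."""
    if cache is None:
        cache = DirectMappedCache()
    for address in trace:
        cache.access(address)
    total = cache.hits + cache.misses
    return cache.misses / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical trace: addresses 0 and 2048 map to the same cache line,
    # so alternating between them causes conflict misses.
    conflict_trace = [0, 4, 8, 2048, 0, 2048]
    print(simulate(conflict_trace))
```

In this hypothetical trace, addresses 0 and 2048 compete for the same cache line, so four of the six accesses miss. Conflicts of this kind are what profile-guided placement, as in effort (1), would aim to reduce by relocating one of the two objects.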