As the computing industry steps into the many-core regime, the main memory system has become one of the key bottlenecks that limits both performance and scalability. To address these challenges, the memory industry is developing 3D-stacked DRAM technology. Die stacking can provide lower latency, much higher bandwidth, and significantly reduced energy dissipation. Unfortunately, stacked memory is unlikely to have sufficient capacity to completely replace traditional DRAM. Therefore, future memory systems will likely use stacked memory in combination with off-chip DRAM, either architecting stacked DRAM as a giga-scale cache or as heterogeneous main memory. However, to fully utilize the potential of stacked memory, the system architecture must make choices that exploit the unique latency and bandwidth characteristics offered by stacked DRAM. For example, simply applying traditional "well-understood" cache designs and memory designs to stacked DRAM results in low performance and poor bandwidth utilization.
This project first looks at caching organizations and management strategies for stacked DRAM that are tailored to exploit latency and bandwidth properties of 3D stacking. It then looks at memory organizations that can incorporate stacked memory as part of memory address space, without relying on OS support to exploit temporal locality and still perform memory management at fine granularity. Finally, this project investigates morphable architectures that can dynamically reconfigure the stacked DRAM between cache structure and main memory, in order to conserve power and optimize performance depending on the workload requirements. The research solutions in this study will thus help future systems obtain an order of magnitude improvement in both bandwidth and energy-efficiency from the effective use of memory stacking.