As the volume of data processed by today’s systems continues to increase, the traditional organization of memory systems is shifting to accommodate that accelerating growth. Data-centric applications such as irregular graph-mining algorithms, distributed machine learning, and genome sequencing must process and store large amounts of data, and they generate massive amounts of intermediate data that must be moved among compute resources. Memory-centric computing is a potential solution to the performance bottleneck of current systems. Near- or in-memory computing can mitigate bandwidth limitations by reducing data movement between memory and host processing units; a remote memory pool shared by all processing units over a fast interconnect can overcome current capacity constraints. Both solutions are promising for breaking down the memory wall. However, it is challenging to realize the full potential of both solutions through direct integration.
In this project, the investigators propose an integrated, full-stack system to enable memory-centric computing (SMC2). The system will incorporate emerging near-memory data processors (NDP) and an extensible remote memory pool to minimize the performance impact of memory accesses in graph-mining applications. The research tasks cover four areas: architecture, the software/hardware interface, programming models and compilers, and performance modeling and optimization. First, the architecture is revisited to use the NDP hardware to build an active memory system that supports intelligent data prefetching and speculative data push. Next, the system software is redesigned to support NDP function calls, data-push operations, and virtualization. Then, building on the new system abstractions, a new programming model is proposed that allows programmers to specify which tasks can run on the NDP resources and that supports efficient NDP-to-NDP communication. Lastly, a new system performance model and optimization framework are incorporated. By putting the four pieces together, the proposed system can maximize the performance of memory-centric computing with new system abstractions and theories.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.