As the cornerstone of the global information technology infrastructure, large-scale parallel processing platforms host an ever-increasing number of real-time in-memory computing applications (e.g., data/graph analytics, transaction processing, machine learning, and business intelligence). As a result, large-scale distributed in-memory data storage has become an essential component of these platforms. However, conventional implementations of in-memory data storage tend to occupy a large amount of memory capacity and consume substantial CPU cycles, which imposes a significant cost overhead in both memory and CPU resources. How well this cost challenge is addressed will largely determine the overall performance and efficiency of future large-scale parallel processing platforms. This project will have significant impact on the research community and industry. It will also provide interdisciplinary training for graduate and undergraduate students and draw broad participation from students of different levels and backgrounds in collaborative research and education.
This project proposes to improve the cost effectiveness of in-memory data storage by enhancing the functionality and data processing capability of the hardware memory controller. In particular, the in-memory filesystem and the memory controller will explicitly cooperate across the software/hardware layers and share responsibility for optimizing the implementation of in-memory data storage. Such a cross-layer design framework enables the use of memory footprint reduction techniques to reduce memory resource cost without incurring CPU overhead. Moreover, the memory controller will integrate customized hardware engines that carry out certain storage-oriented data processing tasks. These engines can be leveraged to directly reduce the memory and CPU overhead of in-memory data storage. An FPGA-based platform will be implemented to carry out experiments that empirically validate the feasibility and potential effectiveness of the developed design solutions.