While storage capacity and CPU processing power have grown rapidly, the data bandwidth and access times of hard disk drives (HDDs) have not kept pace. As a result, the speed gap between CPUs and disk I/O is widening. Disk arrays can improve overall I/O throughput, but random access latency remains very large because of the mechanical operations involved. Large buffers and deep cache hierarchies can reduce latency, but the reduction in access time has been very limited so far because of poor data locality at the disk storage level.

This proposal aims to rethink the fundamental architecture of storage systems and attempts a paradigm shift in disk-based storage architectures. The approach is to build a new storage architecture that exploits two emerging semiconductor technologies: flash memory SSDs (solid state disks) and GPUs (graphics processing units). The new disk I/O architecture is referred to as I-CASH: Intelligently Coupled Array of SSDs and HDDs. The SSD stores mostly-read "reference data blocks" to make the best use of its high-speed random read performance. The HDD stores the compressed delta between a current I/O block and its corresponding reference block in the SSD, so that random writes are not performed on the SSD during online I/O operations. The SSD and HDD are controlled by a high-speed GPU that performs similarity detection, delta derivation, combination of deltas with reference blocks, and the other functions necessary for interfacing the storage to the host OS. The idea is to leverage the fast read performance of SSDs and the high-speed computation of GPUs to replace, to a great extent, the mechanical operations of HDDs, and thereby achieve I/O performance that is orders of magnitude better than that of traditional disk storage systems. Instead of trying to make HDDs catch up with processor performance, which has proven difficult if not impossible, the proposed approach lets storage systems ride the wave of rapid advances in multicore processors by trading high-speed computation for low access latency.
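
The delta mechanism described above can be illustrated with a minimal sketch. The abstract does not specify the delta or compression algorithm, so the byte-wise XOR delta, the use of zlib, the 4096-byte block size, and the function names `derive_delta` and `apply_delta` are all assumptions made for illustration only, and the sketch runs on the CPU rather than a GPU.

```python
import zlib

# Assumed block size in bytes; the abstract does not state one.
BLOCK_SIZE = 4096


def derive_delta(reference: bytes, current: bytes) -> bytes:
    """Return a compressed delta of `current` against `reference`.

    Hypothetical scheme: byte-wise XOR followed by zlib compression.
    Similar blocks yield a mostly-zero XOR, which compresses well.
    """
    if len(reference) != len(current):
        raise ValueError("blocks must have equal length")
    xor = bytes(r ^ c for r, c in zip(reference, current))
    return zlib.compress(xor)


def apply_delta(reference: bytes, delta: bytes) -> bytes:
    """Rebuild the current block from a reference block and its delta."""
    xor = zlib.decompress(delta)
    if len(xor) != len(reference):
        raise ValueError("delta does not match reference block length")
    return bytes(r ^ x for r, x in zip(reference, xor))


# A reference block (as would be kept on the SSD) and a slightly modified copy.
reference = bytes(range(256)) * (BLOCK_SIZE // 256)
modified = bytearray(reference)
modified[100:108] = b"modified"
current = bytes(modified)

# The small compressed delta is what would be written to the HDD.
delta = derive_delta(reference, current)
restored = apply_delta(reference, delta)

print(len(current), len(delta), restored == current)
```

In this sketch a read of the current block costs one reference-block read plus a small delta read and some computation, which is the trade of computation for latency that the proposal describes.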

It is anticipated that the proposed project will have a broad and transformative impact. 1) Servers at data centers run tens to hundreds of virtual machines that generate large numbers of I/O requests; these workloads can take full advantage of the new storage architecture, with potentially orders-of-magnitude performance improvement. 2) The research will engage both graduate and undergraduate students so that they are prepared for real-world needs. 3) The success of this research will help the economic development of the state of Rhode Island and the nation.

Budget Start: 2010-08-01
Budget End: 2015-07-31
Fiscal Year: 2010
Total Cost: $382,931
Name: University of Rhode Island
City: Kingston
State: RI
Country: United States
Zip Code: 02881