From key-value stores to distributed file systems to distributed databases, networked storage underpins modern Internet services. It allows programmers to separate logic from data, enables high-throughput scale-out, and takes advantage of increasingly fast datacenter networks. However, in the big data era, networked storage faces a new challenge: the amount of data accessed per user request is growing rapidly, outpacing processor speeds and DRAM capacity. Increasingly, user-perceived response times are dominated by the slowest storage accesses, i.e., the 99th-percentile tails, and networked storage is notorious for fat-tailed response times.

We are developing networked storage systems that are 1) always fast and 2) cost efficient. A key approach is to understand and selectively use replication for predictability (or cloning). In this approach, clients issue redundant storage accesses against independent hardware resources, and the first to respond provides the result. Replication for predictability reduces client-perceived variability, leading to always-fast response times. Our implementations are especially cost effective at scale. To lower costs, we study the root causes of slow response times, saving resources by focusing on common causes. We also trade quality (e.g., slightly degraded search engine results) for lower hardware costs when appropriate. For broader impact, the PI will work to transfer the technology to national and local companies.
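
The cloning idea described above can be illustrated with a minimal sketch. It is not the project's implementation: the function names `simulated_read` and `cloned_read`, the per-replica delay table, and the returned value format are all hypothetical, and real storage accesses are replaced by simulated sleeps. The sketch only shows the core pattern of issuing the same read to several independent replicas and keeping whichever answer arrives first.

```python
import asyncio


async def simulated_read(replica_id: int, key: str, delay: float) -> tuple[int, str]:
    # Hypothetical stand-in for a networked storage access.
    # `delay` models the replica's response time in seconds.
    await asyncio.sleep(delay)
    return replica_id, f"value-for-{key}"


async def cloned_read(key: str, delays: dict[int, float]) -> tuple[int, str]:
    # Issue the same read against every replica (assumed to be
    # independent hardware resources) at the same time.
    tasks = [
        asyncio.create_task(simulated_read(replica_id, key, delay))
        for replica_id, delay in delays.items()
    ]
    # Wait only until the first replica responds.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Cancel the slower, now-redundant accesses to free resources.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # The first response provides the result.
    return next(iter(done)).result()
```

For example, `asyncio.run(cloned_read("user:42", {0: 0.2, 1: 0.01}))` returns the answer from replica 1, so the client never observes replica 0's slow response. The price is the extra work issued to the replicas, which is why the abstract stresses using replication selectively.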

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1320071
Program Officer: M. Mimi McClure
Project Start:
Project End:
Budget Start: 2013-10-01
Budget End: 2017-09-30
Support Year:
Fiscal Year: 2013
Total Cost: $400,000
Indirect Cost:
Name: Ohio State University
Department:
Type:
DUNS #:
City: Columbus
State: OH
Country: United States
Zip Code: 43210