B-trees have been the data structures of choice for external-memory searching for decades because they minimize the number of disk-block accesses performed during a search. It is well known, however, that B-trees are empirically suboptimal because they exploit data locality at only one level of granularity, typically disk blocks, but not at coarser granularities, such as disk tracks, or finer granularities, such as cache lines.

Theoretical developments on cache-oblivious data structures and algorithms have shown how to achieve nearly optimal locality of reference simultaneously at every granularity. A striking feature of cache-oblivious data structures is that they free the programmer from the burden of tuning the code for cache and disk effects. The PIs' recent experiments suggest that cache-oblivious B-trees (CO B-trees) can surpass the performance of highly tuned traditional B-trees. CO B-trees achieve superior performance because they approximately optimize for all memory effects. In contrast, cache-aware algorithms ignore important aspects of the memory hierarchy. CO B-trees are not yet ready to be used in file systems and data bases, however, because they lack essential capabilities of industrial-strength B-trees, such as support for variable-size keys, concurrent accesses, and transactions.

The researchers propose to investigate how CO B-trees can achieve their potential. The researchers plan to study the wide range of algorithmic problems in data structures, stringology, and distributed systems required to develop a full-featured CO B-tree. In addition, the researchers plan to solve online scheduling problems so that virtual-memory systems can provide efficient support for cache-obliviousness. This algorithmic work is necessary to transfer CO technology to other areas of computer science, engineering, and scientific computing and is intended to transform how scientists and engineers manipulate massive data sets.

Project Start
Project End
Budget Start
2006-08-01
Budget End
2010-07-31
Support Year
Fiscal Year
2005
Total Cost
$150,000
Indirect Cost
Name
Rutgers University
Department
Type
DUNS #
City
New Brunswick
State
NJ
Country
United States
Zip Code
08901