The CirrusDB project seeks to address several key challenges that arise when building a "database as a service" (DBaaS). The goal of such a service is to provide a SQL interface to many applications, storing data that today might be spread across hundreds or thousands of separate database management systems. This is attractive because the costs incurred by individual users in the form of software licensing, hardware, management, and energy can be substantially lowered by multiplexing many different databases onto a smaller hardware footprint. Moreover, the costs can be made proportional to actual usage, sparing application developers significant up-front investment.

The intellectual merit of CirrusDB relates to three key challenges in this area:

1) Multi-tenancy, in which the resource use of a complex set of database workloads is monitored and the databases are consolidated onto a minimum number of physical machines while ensuring performance isolation and live migration with no downtime. Unlike existing work on large-scale multi-tenancy, which has focused on co-locating tenants with similar schemas, CirrusDB attempts to co-locate tenants whose combined resource utilization profiles will not exceed the capacity of the machine on which they are hosted.

2) Scalability, in which the responsibility for query processing (and the corresponding data) is partitioned among multiple nodes in the service to achieve high throughput, using a novel graph partitioning strategy.

3) Privacy, in which the DBaaS infrastructure executes SQL queries issued by applications over encrypted data, enabling complete SQL processing over fully private data. The key idea is to use the notion of "adjustable security": the data is encrypted in layers in a way that allows not just equality checks but also range queries, sorting operations, and joins to be executed efficiently.
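
The consolidation idea in challenge 1 can be illustrated with a minimal sketch. It is not CirrusDB's actual placement algorithm, which this abstract does not describe. It assumes each tenant's load is a short time series expressed as a fraction of one machine's capacity, and it uses a simple first-fit-decreasing heuristic. The function name `consolidate` and the example tenant names are hypothetical.

```python
from typing import Dict, List


def consolidate(profiles: Dict[str, List[float]], capacity: float) -> List[List[str]]:
    """Greedy first-fit-decreasing placement of tenants onto machines.

    profiles maps a tenant name to its load over time (assumed equal-length
    series). A tenant fits on a machine only if the summed load stays within
    capacity at every time step, not merely on average.
    """
    # Place the most demanding tenants first, ordered by peak load.
    order = sorted(profiles, key=lambda t: max(profiles[t]), reverse=True)
    machines = []  # each entry: {"tenants": [...], "load": [...]}
    for tenant in order:
        profile = profiles[tenant]
        if max(profile) > capacity:
            raise ValueError(f"{tenant} exceeds the capacity of a single machine")
        for machine in machines:
            combined = [a + b for a, b in zip(machine["load"], profile)]
            if max(combined) <= capacity:
                machine["tenants"].append(tenant)
                machine["load"] = combined
                break
        else:
            # No existing machine can host this tenant: open a new one.
            machines.append({"tenants": [tenant], "load": list(profile)})
    return [machine["tenants"] for machine in machines]


# Hypothetical workloads: two tenants with complementary peaks, one steady.
profiles = {
    "day_app": [0.75, 0.75, 0.25, 0.25],
    "night_app": [0.25, 0.25, 0.75, 0.75],
    "steady": [0.5, 0.5, 0.5, 0.5],
}
placement = consolidate(profiles, capacity=1.0)
print(placement)
```

In this example the two tenants with complementary peaks share one machine, so two machines suffice. A placement that compared only peak loads would need three, because every pair of peaks sums to more than 1.0.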

CirrusDB will have broad impact in that it will result in database solutions that enable a large, multi-node database service to be deployed both on a public cloud and in private data centers, providing multi-tenancy, scale-out using automatic partitioning, and privacy for SQL query execution. This will significantly lower the administrative and capital costs required to run large-scale databases. CirrusDB will also reduce the barrier to entry for applications that use databases, because in-house database expertise will no longer be necessary even when high transactional performance is needed.

For further information see the project web site at the URL: http://db.csail.mit.edu/cirrusdb/

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1065219
Program Officer: Nan Zhang
Project Start:
Project End:
Budget Start: 2011-04-01
Budget End: 2016-03-31
Support Year:
Fiscal Year: 2010
Total Cost: $1,200,000
Indirect Cost:

Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139