Data management systems are undergoing a sea change. Specialized engines with specialized physical storage structures are emerging to address the extreme data volume and performance requirements of modern applications. At the same time, the complexity of these systems, the applications in which they are used, and the platforms on which they are built is increasing. Thus, there is a growing need for automatic database design for these emerging database systems. Unfortunately, previous work in automatic design cannot be used directly. While the conceptual framework of previous research is useful, the specifics must be reworked to take advantage of the opportunities and address the challenges that these new structures present. Furthermore, most design tools only create complete designs. There is also a need for automatic designers that produce a new design while accounting for the cost of migrating from the old design to the new one. Such an incremental designer would often choose a sub-optimal design when that design delivers, for example, 80% of the benefit for 20% of the migration work.
The PIs propose to investigate this incremental automatic design paradigm for newly emerging database systems. Specifically, the project addresses three data management platforms: column stores for OLAP; main-memory, cluster-based systems for OLTP; and an extension to row stores for exploiting correlations in data attributes.
The PIs expect to extend the work on current design tools by demonstrating workable incremental designers as described above. There is a strong need for such tools, and if this research is successful, it should enable successful deployment of many new-style data managers. Moreover, incremental design ideas based on sound cost-benefit analysis are applicable to other data-intensive computing environments and constitute an important direction towards truly autonomic computing. Further information on the project can be found on the project web page: http://database.cs.brown.edu/projects/auto/