Many data-intensive systems, such as commercial or scientific database systems, store vast amounts of data. Some metrics (e.g., performance) of answering user queries on stored data can be improved by using derived data, such as indexes or materialized views. The goal of this project is to develop an extensible framework for designing and using derived data in answering database queries efficiently. The outcomes of the project are expected to be general and independent of a specific data model (e.g., relational or XML), while giving guarantees with respect to query-performance improvement.
The approach consists of developing and evaluating mathematical models and algorithms for designing and using views and indexes for common types of queries on relational and XML data. The techniques will be experimentally evaluated using an open-source implementation of a database-management system, on synthetic and real relational and XML databases. Expected outcomes of the project include automated tuning of data-access characteristics in a variety of applications, thus enhancing the quality of user interactions with data-intensive systems.
The PI will develop and disseminate practical solutions to the problem. Expected outcomes of the project include sequences of research-oriented exercises that will increase novelty and excitement in the curriculum and enhance the learning experience for students. By working on the proposed project, the diverse body of students at NCSU will obtain a unique set of skills that will position them competitively in the modern workplace.
The PI will make the results of the project widely available, including the software she develops.