The goal of this research project is to advance the management of "semistructured" data: data that may be irregular or incomplete, or whose structure may change rapidly or unpredictably. Semistructured data is prevalent today, particularly on the World Wide Web and when integrating information from heterogeneous sources. The project is developing a complete database management system (DBMS) for semistructured data, which enables such data to be stored, queried, and updated easily and efficiently. The system, called "Lore", includes all capabilities of a traditional DBMS, with particular research emphasis on the new challenges that semistructured data brings to the areas of object storage and clustering, indexing, cost-based query optimization, trigger processing, and user interaction models. The project also considers benchmarking and scalability issues for semistructured data, integrating externally obtained data with Lore data during query processing, and smoothly combining structured data, semistructured data, and information-retrieval-style text search. Because the data model used by Lore is nearly identical to XML -- an emerging standard for data representation on the World Wide Web -- Lore is well positioned as a DBMS for storing and querying Web data more effectively than other research or commercial systems available today. Results from the project will be disseminated as research papers and as a freely available prototype software system.
http://www-db.stanford.edu/lore