The goal of the Semantic Web is to free Web data from the applications that control it, so that the data can be easily described and exchanged. This is accomplished by supplementing natural language and other data found on the Web with machine-readable metadata in statement form (e.g., X is-a person, X has-name Joe, X has-age 35) and by enabling descriptions of data ontologies, so that data from different applications can be integrated through ontology mapping. One element of this vision is to turn the Web into a giant database, against which one can issue structured queries and receive structured answers in response.
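To make the statement form concrete, the following is a minimal sketch that holds the three example statements as (subject, predicate, object) triples and answers a structured query by pattern matching. The `match` helper and the in-memory list are illustrative assumptions only; they are not part of SW-Store or of any RDF library.

```python
from typing import Optional

Triple = tuple[str, str, str]

# The three example statements from the text, as (subject, predicate, object).
triples: list[Triple] = [
    ("X", "is-a", "person"),
    ("X", "has-name", "Joe"),
    ("X", "has-age", "35"),
]

def match(store: list[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Hypothetical helper: return triples agreeing with every non-None field."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Structured query: "the name of every person".
# Two patterns are joined on their shared subject.
people = {subj for subj, _, _ in match(triples, p="is-a", o="person")}
names = [obj for subj, _, obj in match(triples, p="has-name") if subj in people]
```

Here the answer is itself structured data (a list of names) rather than a ranked list of documents, which is the sense in which the Web is treated as a database.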
The SW-Store project is undertaking the clean-slate design of a DBMS specifically architected for this type of Web metadata and for the prevalent Semantic Web data model, the Resource Description Framework (RDF). The management of Semantic Web data presents many difficult challenges. The size of the data is growing rapidly, and in theory could reach the scale of the Web. The types of queries vary greatly in complexity, ranging from keyword search to complicated parameterized subgraph matching. Data integration, inference, and reasoning must be primitive operations that can operate at scale without human intervention. A data management system must not only be a place where data is stored and from which data is accessed; it must also use the machine-readable semantics of the data to develop higher-level models and help guide a user through the mass of information. In sum, a data management system for the Semantic Web will look very different from a standard, transactional, relational database system.
The SW-Store project researches the architecture of such a system. This research is inherently interdisciplinary, bringing in ideas from the data management, Semantic Web, and artificial intelligence communities. The project involves experimenting with partitioning schemes, in which data is allocated to different nodes of a shared-nothing cluster so that queries can be run in parallel across multiple machines. It also involves exploring how ontology reasoning can be integrated inside the database system so that it can benefit from the near-limitless scalability a shared-nothing cluster can offer. SW-Store further investigates providing iterative query interfaces and integrating complex queries with text search. Finally, the project involves studying the design of the storage layer for a Semantic Web data management system, looking at how data should be laid out, how updates should be performed, and which materialized views should be created.
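As an illustration of what a partitioning scheme does, the sketch below hashes each triple's subject to pick a node, so that all statements about one resource are stored together. Hashing on the subject is only one simple possibility chosen for this example, not necessarily a scheme SW-Store adopts, and the names `node_for` and `partition` are hypothetical.

```python
import hashlib
from collections import defaultdict

Triple = tuple[str, str, str]

def node_for(subject: str, num_nodes: int) -> int:
    """Map a subject to a node id with a hash that is stable across runs."""
    digest = hashlib.sha1(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

def partition(triples: list[Triple], num_nodes: int) -> dict[int, list[Triple]]:
    """Assign each triple to the node chosen by its subject (assumed scheme)."""
    nodes: dict[int, list[Triple]] = defaultdict(list)
    for triple in triples:
        nodes[node_for(triple[0], num_nodes)].append(triple)
    return nodes

data: list[Triple] = [
    ("X", "is-a", "person"),
    ("X", "has-name", "Joe"),
    ("X", "has-age", "35"),
    ("Y", "is-a", "person"),
]
layout = partition(data, num_nodes=4)
```

Under this assumed scheme, a query that touches only the properties of a single subject can be answered by one node, while queries that follow links between subjects may require communication across nodes. That trade-off is the kind of question such experiments would examine.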
Further information about the project can be found at the project Webpage: http://db.cs.yale.edu/swstore/