Today, the tools used to design a small personal database are often as complex as those used to design enterprise-size information systems. This complexity challenges the human ability to deal effectively with the structures these tools employ. In addition, as the world becomes increasingly networked, it is both possible and desirable to access and integrate information from multiple autonomously created data stores. Even if each of these data stores is individually simple, the collection as a whole may be very complex. Finally, information stores persist over time, and our needs change over that period. Consequently, the stores must evolve over time, as must our queries against them. Information systems must be designed to adapt to this change.
This project develops techniques for the incremental design of XML-based information systems. Techniques are developed for the hierarchical decomposition of schemas and, through this decomposition, for the resolution of multiple incomplete schema specifications, considered as "views" of related information contexts. Automated techniques are also developed for mapping these specifications into XML schemas that are both suited to efficient query processing and normalized to satisfy known functional dependencies and avoid update anomalies.
This project develops a formal conceptual basis for the design of complex heterogeneous information systems. It also develops incremental techniques to deal with evolutionary change over time, as well as with the merger (and loss) of data stores. If successful, this project will greatly reduce the human effort involved in the design and maintenance of information stores. This applies particularly to designers and users of information systems who may not be familiar with the underlying technical details, such as domain experts who want to bring together information from many heterogeneous sources and whose information needs change rapidly. The likelihood of such impact will be maximized by working closely with a group of biologists who will serve as local users of the technology and provide feedback. The project Web site www.eecs.umich.edu/~jag/design/ will be used to disseminate the results, including software tools created in this project, which will be made available free of charge for academic use.