This research project focuses on developing a data integration and transformation process that supports maintainability, adaptability, and evolution. Data integration systems are software systems that permit the transformation, integration, and exchange of structured data that has been designed and developed independently. The often subtle and complex interdependencies within data can make the creation, maintenance, and use of such systems quite challenging. The PI and collaborator Renee Miller (University of Toronto) have available a robust arsenal of tools and mechanisms for reconciling semantic differences in how data is represented, including views, mappings, and transformation languages. The major focus is on the maintenance of the metadata necessary to achieve semantic integration and sharing of data. This project develops an integration and transformation process that is designed for evolution. The research will develop a new theory of metadata discovery and adaptation based on modern statistical learning, and a new theory of the data integration process that supports not only automation but also maintainability, adaptability, and evolution. The major contribution of this research will be the development of a design process that supports robust data sharing, a crucial aspect of the Science of Design. As part of the broader impacts of this work, the expected results will contribute to an enhanced infrastructure for research by developing a benchmark for schema and mapping discovery and management tasks. The research results and the benchmark will be accessible on the project Web site (www.cs.umd.edu/~getoor/sod) to facilitate dissemination of knowledge and tools to a variety of scientific communities. The methods developed in this project will be of particular value to scientists who routinely need to gather, manage, and integrate diverse data sets.
The researchers also plan to partner with industry collaborators in order to learn from and address industry needs, receive feedback, and facilitate technology transfer.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
0438866
Program Officer
Ephraim P. Glinert
Project Start
Project End
Budget Start
2005-01-01
Budget End
2009-12-31
Support Year
Fiscal Year
2004
Total Cost
$610,200
Indirect Cost
Name
University of Maryland College Park
Department
Type
DUNS #
City
College Park
State
MD
Country
United States
Zip Code
20742