The goal of this research project is to develop methods for bridging semantic heterogeneity. Semantic heterogeneity arises when data needs to be shared among multiple data sources and applications that use different terminologies. For example, companies own large numbers of databases and need to coordinate among them in order to leverage their value. Similarly, large-scale scientific projects and coordination among government agencies require sharing data across multiple repositories. The approach consists of collecting a large number of schemas in a particular domain and learning the patterns, and the variations on those patterns, that database designers use in the domain. By leveraging such patterns, it is possible to match previously unseen database schemas in the domain. The techniques are validated by developing systems for matching disparate schemas, and by applying the techniques to searching the growing number of web services available on the World Wide Web. One of the systems being built in this research is a search engine for web services that attempts to capture the underlying meaning of web-service operations; it will be available from the University of Washington (http://data.cs.washington.edu/schemaMatching/index.htm). The results of the project will provide a set of online services as well as public data sets that can be used by the research community. Possible direct applications of this research include biomedical informatics and deep-web search. The results

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0415175
Program Officer: Sylvia J. Spengler
Project Start:
Project End:
Budget Start: 2005-09-01
Budget End: 2009-08-31
Support Year:
Fiscal Year: 2004
Total Cost: $270,000
Indirect Cost:
Name: University of Washington
Department:
Type:
DUNS #:
City: Seattle
State: WA
Country: United States
Zip Code: 98195