We propose to develop a model in which users of laboratory genome databases can freely interlink local databases with public genome databases. By interlinking we mean that users of a laboratory genome database can form and issue ad hoc queries that entail cross-database joins between the local database and public genome databases in real time. Specifically, we study the interoperability between the laboratory database DB/12 and two public genome databases, GDB and GSDB. In one proposed scenario, users and developers of DB/12 will be able to form and issue join queries among DB/12, GDB and GSDB relatively quickly, even if they are not familiar with the schemas of GDB and GSDB. In another scenario, users and developers of other laboratory genome databases use our proposed tool to interlink their own databases with the public portion of DB/12, GDB and GSDB. The key component of our proposed model is a graphical ad hoc query interface designed to help users deal with unfamiliar third-party database schemas and thereby ease query formulation. Our specific goal is to enable users to form SQL queries graphically within 5 to 10 minutes despite unfamiliarity with the third-party database schemas of the federation. Our approach creates and uses metadata describing schema relationships among the existing genome databases that participate in the federation. This study will clarify (i) what types of meta-level information about database schemas are necessary to make interoperability between genome databases feasible, and (ii) what is the most effective way of organizing and storing such meta-level information for efficient mutual use.
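As an illustration of how metadata about schema relationships could drive query formulation, the following minimal Python sketch builds a cross-database join query from a small list of link records. It is not the project's implementation: the `Link` record, the `build_join_query` function, and every table and column name (for example `db12.clone.locus_symbol`) are hypothetical and do not reflect the actual DB/12, GDB or GSDB schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One metadata record: two columns in different databases hold the same value."""
    left: str   # written as "database.table.column"
    right: str  # written as "database.table.column"


# Hypothetical metadata. None of these names come from the real schemas.
LINKS = [
    Link("db12.clone.locus_symbol", "gdb.locus.symbol"),
    Link("gdb.locus.accession", "gsdb.sequence.gdb_accession"),
]


def table_of(column: str) -> str:
    """Return the "database.table" part of a "database.table.column" name."""
    return column.rsplit(".", 1)[0]


def build_join_query(select_columns, links):
    """Build a SQL join from the requested columns and the metadata links."""
    # Collect every table mentioned by the links, keeping first-seen order.
    tables = []
    for link in links:
        for column in (link.left, link.right):
            table = table_of(column)
            if table not in tables:
                tables.append(table)

    # Refuse to select from a table that the metadata cannot connect.
    unreachable = [c for c in select_columns if table_of(c) not in tables]
    if unreachable:
        raise ValueError(f"no metadata link reaches: {unreachable}")

    # Each link becomes one equality condition in the WHERE clause.
    conditions = " AND ".join(f"{link.left} = {link.right}" for link in links)
    return (
        f"SELECT {', '.join(select_columns)} "
        f"FROM {', '.join(tables)} "
        f"WHERE {conditions}"
    )


if __name__ == "__main__":
    print(build_join_query(["db12.clone.name", "gsdb.sequence.length"], LINKS))
```

In this sketch the user names only the columns of interest, and the join conditions come entirely from the stored links, which is the sense in which such metadata could spare users from learning a third-party schema. A graphical interface would presumably sit on top of a step like this, but that layer is not shown.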

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project (R01)
Project #
5R01HG000772-04
Application #
2026819
Study Section
Genome Study Section (GNM)
Project Start
1993-09-28
Project End
1998-12-31
Budget Start
1997-01-01
Budget End
1997-12-31
Support Year
4
Fiscal Year
1997
Name
University of Connecticut
Department
Engineering (All Types)
Type
Schools of Engineering
City
Storrs-Mansfield
State
CT
Country
United States
Zip Code
06269
Cheung, K H; Nadkarni, P; Miller, P et al. (1998) Automatic query mapping among genomic databases: a pilot exploration. Proc AMIA Symp :942-6
Cheung, K H; Nadkarni, P M; Shin, D G (1998) A metadata approach to query interoperation between molecular biology databases. Bioinformatics 14:486-97