Buneman, Peter University of Pennsylvania $174,951

DLI Phase 2 - DATA PROVENANCE

This project will address issues associated with data provenance. Provenance is concerned with how information has arrived at the form in which it appears: who produced it, who has corrected it, how old it is, where it was originally produced, and so forth. Understanding provenance has occupied scientists, historians, textual critics, and other scholars for centuries.

The provenance of data in databases is a newer and larger problem, because one is interested in data at all levels of granularity - from a single pixel in a digital image to a whole database. Just as scholars comment on documents by attaching annotations (marginalia) to text, part of the solution to recording provenance is the attachment of annotations to components of databases. Database researchers have recently considered loosely structured forms of data and have developed software systems for querying and storing such data. This work is closely related to new formats that have been developed for structured documents on the Web. It is expected that this technology will provide the substrate for recording and tracking provenance by advancing new data models, new query languages and new storage techniques.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9817444
Program Officer: Stephen Griffin
Project Start:
Project End:
Budget Start: 1999-06-01
Budget End: 2003-05-31
Support Year:
Fiscal Year: 1998
Total Cost: $504,988
Indirect Cost:
Name: University of Pennsylvania
Department:
Type:
DUNS #:
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104