Collaborative data analysis has become both a necessity and a trend in the era of big data. In such collaborative environments, intellectual property protection mechanisms are critical for maintaining and encouraging research partnerships. Such mechanisms should protect not only data sources and data analysis algorithms but also data provenance, i.e., the data processing history. For example, participating parties may want to control access to, and sharing of, the various kinds of data products (source, intermediate, and final), the processing steps, and their inter-dependencies. However, existing mechanisms do not provide such fine-grained protection for the provenance of multi-step data analytics procedures (workflows). To address this challenge, this project aims to explore novel mechanisms for securing access to, and querying of, collaborative scientific workflow provenance.

The technical goal of this project is to understand in depth the feasibility of dataflow provenance-oriented access and querying mechanisms. This high-risk, high-reward work will produce the following two outcomes: (1) a multi-level, fine-grained secure access and querying mechanism for provenance collection, including sensitive data as well as sensitive dependencies among data, tasks, and users, and (2) automated analysis algorithms to ensure that provenance access and querying policies conform to desirable constraints on evolving dependencies. The resulting techniques and tool will be evaluated in the genomic data analysis domain to demonstrate their usability and significance in the context of collaborative data analytics. The techniques will be integrated into the NSF-sponsored collaborative scientific workflow tool to support secure data analytics collaboration.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Type
Standard Grant (Standard)
Application #
1747095
Program Officer
Wei-Shinn Ku
Project Start
Project End
Budget Start
2017-09-01
Budget End
2020-08-31
Support Year
Fiscal Year
2017
Total Cost
$200,000
Indirect Cost
Name
Carnegie-Mellon University
Department
Type
DUNS #
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213