EAGER: Secure Workflow Provenance for Collaborative Data Analytics

Zhang, Jia; Lu, Shiyong

Abstract

Collaborative data analysis has become both a necessity and a trend in the era of big data. In such collaborative environments, intellectual property protection mechanisms are critical to maintaining and encouraging research partnerships. Such mechanisms should protect not only data sources and data analysis algorithms but also data provenance, i.e., the data processing history. For example, participating parties may ask to secure access to and sharing of various kinds of data products (source, intermediate, and final), processing steps, and their inter-dependencies. However, existing mechanisms do not provide such fine-grained protection for the provenance of multi-step data analytics procedures (workflows). To address this challenge, this project aims to study and explore novel mechanisms to secure access to and querying of collaborative scientific workflow provenance.

The technical goal of this project is to understand in depth the feasibility of dataflow provenance-oriented access and querying mechanisms. This high-risk, high-reward work will produce the following two outcomes: (1) a multi-level, fine-grained secure provenance access and querying mechanism for provenance collection, covering sensitive data as well as sensitive dependencies between data, tasks, and users, and (2) automated analysis algorithms to ensure that provenance access and querying policies conform to desirable constraints on evolving dependencies. The intended techniques and tool will be evaluated in the genomic data analysis domain to demonstrate their usability and significance in the context of collaborative data analytics. The resulting techniques will be integrated into the NSF-sponsored collaborative scientific workflow tool for secure data analytics collaboration.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1747095
Program Officer: Wei-Shinn Ku

Budget Start: 2017-09-01
Budget End: 2020-08-31
Fiscal Year: 2017
Total Cost: $200,000

Carnegie-Mellon University, Pittsburgh, PA, United States
