CAREER: Scalable Information Flow Monitoring and Enforcement through Data Provenance Unification

Bates, Adam

Abstract

System intrusions have become more subtle and complex. Attackers now covertly observe and probe systems for prolonged periods before launching devastating attacks. In such an environment, it has grown prohibitively difficult for system administrators to identify suspicious events, correlate these events into an attack pattern, and determine an appropriate response. Data provenance is a method of modeling a system's execution in the form of a causal relationship graph, allowing investigators to trace the ancestry of data objects and identify relationships between seemingly independent events. The goal of the proposed work is to develop techniques that enable the use of data provenance as an expressive and efficient monitoring tool in large distributed systems. These mechanisms will provide an unprecedented capability to reason about system events, centrally monitor activities within data centers, and express fine-grained enforcement of security properties based on the historical flow of data. Research and software artifacts will be made available to the broader community through the Linux provenance web site.
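To make the idea of tracing ancestry through a causal relationship graph concrete, the following is a minimal illustrative sketch and not the project's implementation. The `ProvenanceGraph` class, its method names, and the node labels are all hypothetical, chosen only for this example.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy causal relationship graph (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps each node to the set of nodes it was directly derived from.
        self._parents = defaultdict(set)

    def add_edge(self, source, derived):
        """Record that `derived` causally depends on `source`."""
        self._parents[derived].add(source)

    def ancestry(self, node):
        """Return every node that `node` transitively depends on."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for parent in self._parents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical events: a download, a shell launched from it, and a file write.
graph = ProvenanceGraph()
graph.add_edge("socket:10.0.0.5:443", "process:curl")
graph.add_edge("process:curl", "file:/tmp/payload")
graph.add_edge("file:/tmp/payload", "process:sh")
graph.add_edge("process:sh", "file:/etc/passwd")
graph.add_edge("process:cron", "file:/var/log/cron.log")  # unrelated activity

ancestors = graph.ancestry("file:/etc/passwd")
```

In this sketch, querying the ancestry of the modified file returns the shell, the payload, the downloader, and the remote socket, while the unrelated cron activity is excluded. That is the sense in which seemingly independent events become linked.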

The proposed work will examine central challenges related to expressivity and scalability that currently prevent the further proliferation of provenance-based auditing techniques. To address the semantic gap that has traditionally prevented system-layer auditing from being able to explain higher-level application behaviors, this project pursues the design of universal provenance mechanisms that leverage binary analysis to transparently identify siloed application-layer logging activities, extract their semantics, and graft the information onto a causal relationship graph that encodes the entire system's execution. Grammar induction techniques will be leveraged to overcome the tremendous storage burden of provenance and provide a scalable central monitoring framework for data centers. After enriching system-layer auditing and enabling the efficient communication of suspicious activities via provenance traces, data provenance will be integrated into enforcement mechanisms to address critical security challenges including regulatory compliance, information flow control, and fault attribution. Advancing the state of the art in provenance-based tracing and enforcement should establish a new baseline for reasoning about the flow of data in today's complex computing systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 1750024
Program Officer: Phillip Regalia

Budget Start: 2018-04-01
Budget End: 2023-03-31
Fiscal Year: 2017
Total Cost: $305,833

Institution

University of Illinois Urbana-Champaign, Champaign, IL, United States
