The objective of the proposed research is to systematically understand, evaluate, and address the problem of membership inference in aggregate data publishing, a generic, novel, and dangerous privacy threat in a wide variety of real-world applications. The main idea proposed to address this problem is an information-theoretic model of privacy disclosure as a noisy communication channel. Building on channel coding theory and recent advances in multiple-input multiple-output (MIMO) communication channels, the proposed research studies novel techniques for membership inference and explores the corresponding privacy-preserving mechanisms.
Intellectual Merit: The following salient features distinguish the proposed work from existing studies: (1) The proposed research studies a novel problem, membership inference in aggregate data publishing, which stands in sharp contrast to the traditional inference control problem. In particular, the sensitive information in danger of disclosure under the proposed problem definition is the selection attributes of an aggregate query rather than its measure attributes, which are the focus of traditional inference control. (2) This novel problem also leads to a set of novel solutions based on information theory. In particular, the proposed research models membership inference attacks as modulation techniques in the time and frequency domains for various types of communication channels, e.g., single-input single-output (SISO), multiple-input single-output (MISO), single-input multiple-output (SIMO), and multiple-input multiple-output (MIMO) channels. The proposed channel model enables a uniform evaluation of the effectiveness of both membership inference and privacy-preserving techniques.
Broader Impact: The outcome of this research has broader impacts on the nation's higher education system and high-tech industries. The study of techniques for disclosing sensitive membership information, together with the corresponding privacy-preserving techniques, can help providers of aggregate data publishing, including national health organizations and Internet security service providers, secure their published data. The broader impact of this project also extends to academia. Parts of this project are carried out by students of George Washington University (GWU), Towson University (TU), and University of Massachusetts, Lowell (UML) as part of advanced class projects or individual research projects.