Group-based anonymization is the most widely studied approach to privacy-preserving data publishing; it includes k-anonymity, l-diversity, and t-closeness, among others. The goal of this proposal is to raise a fundamental, previously overlooked issue concerning the privacy exposure of this approach and to develop a computationally efficient solution. Group-based anonymization hides each individual record behind a group in order to preserve data privacy. However, patterns may still be mined from the published anonymized data and used by an adversary to breach individual privacy. The objective of this research is therefore to develop novel group-based anonymization methods that defend against such an attack. The first part of the project defines the attack problem, i.e., that published anonymized data can in fact be mined for privacy attacks, and identifies and formulates the privacy exposure to such an attack. The second part conducts a systematic study of the exposure of existing privacy techniques to the attack. The third part derives the conditions under which the attack can be resisted and develops efficient data publishing algorithms to prevent it from occurring.
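As a concrete illustration (not part of the proposal itself), the k-anonymity condition mentioned above can be checked on a published table by grouping records on their quasi-identifier attributes and requiring each group to contain at least k records. The toy table, column names, and helper function below are hypothetical, a minimal sketch of the idea:

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """Check whether every combination of quasi-identifier values
    appears in at least k records (the k-anonymity condition)."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical published table: ages generalized to ranges,
# ZIP codes truncated, so records hide behind groups.
table = [
    {"age": "30-39", "zip": "606**", "disease": "flu"},
    {"age": "30-39", "zip": "606**", "disease": "cancer"},
    {"age": "40-49", "zip": "607**", "disease": "flu"},
    {"age": "40-49", "zip": "607**", "disease": "flu"},
]

print(is_k_anonymous(table, ["age", "zip"], 2))  # True: each group has 2 rows
```

Note that even though this table is 2-anonymous, the second group is homogeneous in its sensitive attribute (both records have "flu"), which is exactly the kind of pattern an adversary can mine from the published data.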

Project Report

Due to rapid advances in the storage, processing, and networking capabilities of computing devices, there has been tremendous growth in the collection of digital information about individuals. The collected data offer great opportunities for mining useful information, such as analyzing patient records to devise personalized medicine, or analyzing trading records to design more effective policies that avert financial meltdowns in banking systems. However, the data also pose a threat to privacy, because data in raw form often contain sensitive information about individuals. Privacy-preserving data publishing (PPDP) studies how to transform raw data into a version that is immune to privacy attacks yet still supports effective data analysis. Our work identifies a weakness of current PPDP approaches and develops novel alternatives that make the sharing of valuable data safer and more likely, so that this rich information can be used to build a better society.

Group-based anonymization is the most widely studied approach to privacy-preserving data publishing; our work is the first to identify its exposure to such pattern-mining attacks. ε-differential privacy is another approach, originally designed for an interactive querying model. We propose a novel data publishing approach for the non-interactive setting based on ε-differential privacy. The work creates awareness of the weaknesses of current privacy-preserving data publishing schemes and provides an alternative by extending a privacy preservation scheme designed for the interactive query model to the non-interactive data publishing model. This will facilitate the sharing of data to advance data-driven research.
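To illustrate the ε-differential privacy notion referred to above, the standard Laplace mechanism perturbs a numeric query answer with noise scaled to the query's sensitivity divided by ε. The sketch below is a textbook illustration with hypothetical names, not the project's publishing algorithm:

```python
import math
import random

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Perturb a numeric query answer with Laplace(0, sensitivity/epsilon)
    noise, the standard mechanism achieving epsilon-differential privacy."""
    scale = sensitivity / epsilon
    u = random.random() - 0.5  # Uniform(-0.5, 0.5)
    # Inverse-transform sampling of the Laplace distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_answer + noise

# Example: release a count query (sensitivity 1) under epsilon = 1.
random.seed(0)
noisy_count = laplace_mechanism(100.0, sensitivity=1.0, epsilon=1.0)
```

Smaller ε means stronger privacy but noisier answers; extending this interactive, per-query mechanism to a one-shot published dataset is the non-interactive setting the project targets.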

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
0914934
Program Officer
Sylvia J. Spengler
Project Start
Project End
Budget Start
2009-09-01
Budget End
2013-08-31
Support Year
Fiscal Year
2009
Total Cost
$499,831
Indirect Cost
Name
University of Illinois at Chicago
Department
Type
DUNS #
City
Chicago
State
IL
Country
United States
Zip Code
60612