CIF: Small: Privacy and Utility of Databases: An Information-Theoretic Approach

Poor, Harold Vincent; Sankar, Lalitha

Abstract

Information technology and electronic communications have been rapidly applied to every sphere of human activity, including commerce, medicine and social networking. The concomitant emergence of myriad large, centralized, searchable data repositories has made "leakage" of private information via data correlation (inadvertently or by malicious design) an important and urgent societal problem. Maintaining the usefulness of these data sources while also providing the necessary privacy guarantees remains unsolved. This drives the need for an overarching analytic framework that can tell us unequivocally how safe private data can be (privacy) while still providing useful benefit (utility) to multiple legitimate information consumers.

This research develops a unified framework to study the utility-privacy tradeoff irrespective of the type of data source or method of perturbation. Techniques and results from rate-distortion theory are used to model data sources, develop application-independent utility and privacy metrics, and develop a side-information model for dealing with questions of external knowledge. The framework, applicable to single-query data source models, is extended to study the utility-privacy tradeoffs for multiple-query models. Also studied is a successive disclosure problem, which draws on classic results in successive refinement to develop the conditions under which multiple queries result in no additional information loss. The universal framework developed includes tools and techniques to bridge the gap between the information-theoretic model on the one hand and current approaches and the dominant theoretical framework in computer science on the other.
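As a rough illustration of what a utility-privacy tradeoff looks like in rate-distortion terms, the sketch below uses a toy model that is not taken from the project itself. It assumes a single uniformly distributed binary attribute that is released after each bit is flipped independently with probability `flip_prob`. Utility loss is measured as expected Hamming distortion, and privacy loss as the mutual information (in bits) between the true and released values. The function names `binary_entropy` and `tradeoff_point` are hypothetical and chosen only for this example.

```python
import math
from typing import Tuple


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) random variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def tradeoff_point(flip_prob: float) -> Tuple[float, float]:
    """Return (distortion, leakage) for the assumed toy model.

    Assumption: the attribute X is uniform on {0, 1} and the released
    value Y equals X flipped with probability flip_prob.
    - distortion: expected Hamming distortion E[d(X, Y)], equal to flip_prob
    - leakage: mutual information I(X; Y) = 1 - h(flip_prob) bits
    """
    distortion = flip_prob
    leakage = 1.0 - binary_entropy(flip_prob)
    return distortion, leakage


if __name__ == "__main__":
    # More perturbation means higher distortion (less utility)
    # and lower leakage (more privacy).
    for p in (0.0, 0.1, 0.25, 0.5):
        d, leak = tradeoff_point(p)
        print(f"flip_prob={p:.2f}  distortion={d:.2f}  leakage={leak:.3f} bits")
```

In this toy setting the two quantities move in opposite directions: releasing the data unperturbed gives zero distortion but leaks the full bit, while flipping with probability 0.5 leaks nothing but destroys all utility. The framework described above addresses far more general sources, metrics, and side information than this single-bit case.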

Project Report

The ubiquity of technologies such as online data repositories, biometric identification systems, financial (e.g., credit card) databases, healthcare information systems, and smart electricity meters has created new challenges in information security and privacy. The research pursued under this project has developed a fundamental framework for examining, in a general setting, the tradeoff between the privacy of data in such systems and its measurable benefits. Although earlier approaches have considered the issue of data privacy alone, this new ability to understand the basic tradeoff between data privacy and the usefulness of data provides a means for developing methods and protocols for use in practical applications. With the support of this grant, this new methodology has been applied to specific problems in the areas of smart electricity metering, biometric identification systems, and general databases. The importance of this work to society is that it provides a way to understand the basic tradeoffs inherent in using data and in keeping data private. These are opposing goals, and so they must both be considered when designing protocols for information systems. Given the widespread and growing importance of this issue, the research conducted under this grant has the potential to guide this critical area of technology development.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 1016671
Program Officer: Phillip Regalia

Budget Start: 2010-09-01
Budget End: 2014-08-31
Fiscal Year: 2010
Total Cost: $333,493

Institution

Princeton University, Princeton, NJ, United States
