The goal of this project is to provide a principled way of quantitatively characterizing the effect of disclosing private data. Based on statistical decision theory, the proposed framework incorporates user-defined sensitivity information and an identification model into a personalized risk function. The risk is intuitive and interpretable because it is based only on a user-specified loss function and elementary laws of probability and statistics. The proposed framework leads to a more accurate measure of the consequences of popular disclosure policies such as k-anonymity, and to an efficient search for novel optimal policies.
Currently, private data is disclosed according to general policies that do not necessarily reflect users' preferences. The novel framework will let users obtain a quantitative grasp of the consequences of current data disclosure policies. Because the risk is simple and interpretable, this applies in particular to people without a technical or scientific education, who would otherwise remain uninformed about the use of their private data. Effective dissemination of the research results to industry and the popular press has the potential to transform current disclosure policies so that they become more focused on serving the needs of the community. The project also aims to enhance graduate and undergraduate education in the interdisciplinary area of statistical approaches to privacy preservation. Outreach efforts include mentoring of minority students in science and technology. The results of this project are disseminated via the web page www.ecn.purdue.edu/~lebanon/privacyRisk.