Current information technology enables many organizations to collect, store, and use massive amounts and various types of information about individuals. Governments and other organizations increasingly recognize the critical value and enormous opportunities in sharing such a wealth of information. However, data privacy has been a major barrier to such information sharing, bringing much attention to privacy-preserving data publishing and analysis techniques. Differential privacy is widely accepted as one of the strongest unconditional privacy guarantees. While many effective mechanisms have been proposed for the interactive model with differential privacy, non-interactive data release with differential privacy remains an open problem, with recent years seeing only negative results.
This project aims to build a data-driven and adaptive framework for differentially private data release. It circumvents the hardness of differentially private data release in the non-interactive setting through novel use of differentially private primitives that exploit the characteristics of the underlying data. The specific research objectives are to: (1) design adaptive query strategies for releasing data with differential privacy, including traditional relational data and high-dimensional, sparse set-valued data; (2) design statistical inference techniques to accurately answer user queries using the released data; and (3) design algorithms to model and incorporate potentially dynamic workload characteristics in the framework. In addition to formal analysis and experimental evaluations, the project will evaluate and integrate the developed solutions in real health applications at Emory University to support health research while providing rigorous privacy guarantees.
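For readers unfamiliar with the term "differentially private primitive", the following is a minimal sketch of one standard primitive, the Laplace mechanism, applied to a histogram release. It is an illustration only and not the project's adaptive framework: the function names (`laplace_noise`, `dp_histogram`) are hypothetical, the bin list is assumed to be fixed in advance and independent of the data, and neighboring datasets are assumed to differ by adding or removing one record (so each count has sensitivity 1).

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with location 0 and scale `scale`.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(values, bins, epsilon, rng=None):
    """Release noisy counts for a fixed list of bins (hypothetical helper).

    Assumption: adding or removing one record changes exactly one bin
    count by 1, so the L1 sensitivity of the histogram is 1 and the
    Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    counts = Counter(values)
    scale = 1.0 / epsilon
    # Every bin receives noise, including empty ones; skipping empty bins
    # would reveal which values are absent from the data.
    return {b: counts.get(b, 0) + laplace_noise(scale, rng) for b in bins}


if __name__ == "__main__":
    records = ["flu", "flu", "asthma", "flu", "diabetes"]
    domain = ["asthma", "diabetes", "flu", "hypertension"]
    print(dp_histogram(records, domain, epsilon=0.5))
```

A smaller `epsilon` gives stronger privacy and noisier counts. A uniform release like this one adds the same noise to every bin regardless of the data, which is the kind of baseline that a data-driven, adaptive strategy would aim to improve on.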
Success of the proposed research will help overcome barriers to large-scale data sharing and will have broader impacts on society at large, beyond the fields of data privacy and information management. The proposal also includes a set of closely integrated educational activities, including development of a new course on data privacy and security with a strong interdisciplinary emphasis, continued involvement of undergraduate students in research, and encouragement of participation by women and minorities. The project also closely aligns with Emory's university-wide strategic initiatives in Predictive Health and will help develop the new Ph.D. program in Computer Science and Informatics at Emory University.