Our society is in the midst of a great revolution driven by Big Data. This revolution affects every aspect of our lives, from online to offline, from business to science. Privacy today is both a major social concern and an intellectually challenging problem, and the problem is exacerbated by the Big Data trend. Differential privacy is a principled approach to achieving the goal of privacy which, conceived as it was in the context of large databases, is especially suited to Big Data. This workshop will bring together researchers on differential privacy and the world's leading Big Data theoreticians at the Simons Institute in the Fall of 2013, creating a rare opportunity for true breakthroughs in both Privacy and Big Data.
The Simons Institute will be hosting a special semester on "Theoretical foundations of Big Data analysis" during the Fall of 2013, which will bring together experts in the analysis of big data. The purpose of the workshop is, on the one hand, to develop new differential-privacy-inspired techniques for large-scale data analysis that do not threaten the privacy of individuals, and, on the other hand, for the two communities to jointly explore the statistical concepts and techniques underlying both fields. The four-day workshop on "Big Data and Differential Privacy" will take place December 11-14, 2013 at the Simons Institute for the Theory of Computing, located in Berkeley, CA.
Workshop Award Number 1346565
Simons Institute for the Theory of Computing, Dec. 11 – Dec. 14, 2013
Richard Karp, Director of the Simons Institute

The organizers of the workshop were Kunal Talwar (Microsoft Research, chair), Avrim Blum (Carnegie Mellon University), Kamalika Chaudhuri (UC San Diego), Cynthia Dwork (Microsoft Research), and Michael Jordan (UC Berkeley).

Overview: Analysis of large datasets of potentially sensitive private information about individuals raises natural privacy concerns. Differential privacy is a recent area of research that brings mathematical rigor to the problem of privacy-preserving analysis of data. Informally, the definition stipulates that any individual has only a very small influence on the (distribution of the) outcome of the computation. Thus an attacker cannot learn anything about an individual's report to the database, even in the presence of any auxiliary information she may have. A large and increasing number of statistical analyses can be done in a differentially private manner while adding little noise. This has been made possible in part by deep connections to learning theory, convex geometry, communication complexity, cryptography and robust statistics. This workshop brought together differential privacy researchers and statisticians, with the goal of exploring connections between the two fields: from enabling practical, accurate and differentially private data analyses on large datasets, to connections between statistical concepts (such as robustness, sparse regression, and multiple hypothesis testing) and differential privacy.
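To make the informal definition above concrete, the classic mechanism for answering a counting query with little noise is the Laplace mechanism: since adding or removing one individual changes a count by at most 1 (sensitivity 1), adding Laplace noise of scale 1/epsilon yields epsilon-differential privacy. The following is a minimal sketch (the function name and parameters are illustrative, not from the workshop report):

```python
import numpy as np

def laplace_mechanism(true_count, epsilon, rng):
    """Return an epsilon-differentially private estimate of a counting query.

    A counting query has sensitivity 1: any one individual's data changes
    the true count by at most 1. Adding Laplace noise with scale
    1/epsilon then satisfies epsilon-differential privacy.
    """
    scale = 1.0 / epsilon
    return true_count + rng.laplace(loc=0.0, scale=scale)

# Illustrative usage: a private answer to "how many records satisfy P?"
rng = np.random.default_rng(0)
noisy = laplace_mechanism(true_count=100, epsilon=0.5, rng=rng)
```

Note the accuracy/privacy trade-off visible in the scale 1/epsilon: smaller epsilon (stronger privacy) means more noise, but for large datasets the noise is small relative to the true count, which is one reason the approach suits Big Data.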