Natural language privacy policies have become a de facto standard for addressing expectations of notice and choice on the Web. Yet there is ample evidence that users generally do not read these policies, and that those who occasionally do read them struggle to understand what they read. Initiatives aimed at addressing this problem through the development of machine-implementable standards, or through other solutions that require website operators to adhere to more stringent requirements, have run into obstacles, with many website operators reluctant to commit to anything more than what they currently do. This project offers the prospect of overcoming the limitations of current natural language privacy policies without imposing new requirements on website operators.
This frontier project builds on recent advances in natural language processing, privacy preference modeling, crowdsourcing, formal methods, and privacy interfaces to address these limitations. It combines fundamental research with the development of scalable technologies that semi-automatically extract key features from natural language website privacy policies and present them to users in an easy-to-digest format, enabling users to make more informed privacy decisions as they interact with different websites. Work in this project also involves the systematic collection and analysis of website privacy policies, looking for trends and deficiencies in both the wording and the content of these policies across different sectors, and using this analysis to inform ongoing public policy debates. An important part of this project is working closely with industry stakeholders to enable the transfer of these technologies for large-scale deployment.