Accountability in computer systems is typically provided by preserving a history of activities and data. This allows past events to be analyzed to detect breaches, maintain data quality, and audit compliance with security policies. In some settings, however, retaining a history of past data or operations poses a serious threat to privacy. Privacy and accountability are both important goals, and system designers need to carefully manage the balance between them. This CAREER research is building a database system capable of securely managing history, thus balancing the needs for privacy and accountability. In settings that require it, the system is configurable as "memoryless", protecting privacy by resisting unauthorized attempts to trace activities or recover deleted data. This is achieved by removing data safely when it is deleted, providing an accurate view of the data that is retained, and offering bounds on the lifetime of sensitive data items stored in the system. In other settings, the system supports accountability by retaining desired history, permitting its efficient analysis, and protecting it from unwanted disclosure.
The broader impact of this project includes a publicly available prototype database system embodying the goals above, along with curriculum extensions that bring privacy and system accountability themes into undergraduate education. In addition to enriching existing programs at UMass Amherst for training undergraduates in Information Assurance, this project will foster collaboration among the campuses of the Five College consortium of Western Massachusetts. The project results, including the prototype source code, documentation, and project publications, will be made available via the project website (http://dbgroup.cs.umass.edu/securing-history).
There is often a difficult tension between the benefits of collecting and preserving information about individuals and the potential violations of personal privacy that can result from data retention. This project investigated technology for managing information that is collected while monitoring or analyzing the operation of computer systems. This data collection, which might include web logs, network traces, caches, or other historical data, often reveals sensitive information about individuals and their activities. At the same time, it offers accountability because it allows past events to be analyzed to detect breaches, maintain data quality, and audit compliance with security policies. This project developed conceptual and technological advancements for intelligently negotiating the balance between collection/retention and privacy. The main contributions include the following. First, we studied the unintended preservation and recoverability of sensitive structured information stored in data management systems. We showed that many existing systems preserve more data than expected, often in violation of privacy policies, and we provided new technologies for preventing this preservation. Second, we proposed an approach to sanitizing communication traces (e.g., network logs) that offers both utility and scalability. Our approach consists of a set of simple, formally defined transformation operators that are applied to the trace to remove or obscure sensitive information. These operators can be combined to form composite transformations that produce the published output traces; each composite transformation can be thought of as defining a safe view of the original trace. Third, we proposed a method for releasing synthetic system data that can be used in place of real system traces for the evaluation of performance. This avoids releasing sensitive traces, and the synthetic data comes with a strong guarantee of individuals’ privacy.
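The idea of combining simple transformation operators into a composite "safe view" of a trace can be illustrated with a minimal Python sketch. It is not the project's implementation: the record layout, the operator names (`drop_field`, `pseudonymize`, `compose`), and the salted-hash pseudonym are all hypothetical choices made for illustration, and a truncated salted hash by itself does not provide a formal privacy guarantee.

```python
import hashlib
from functools import reduce
from typing import Callable, Dict, Iterable, Iterator

# Assumption: a trace is a sequence of flat string-to-string records.
Record = Dict[str, str]
Operator = Callable[[Iterable[Record]], Iterator[Record]]


def drop_field(name: str) -> Operator:
    """Hypothetical operator: remove one field from every record."""
    def op(trace: Iterable[Record]) -> Iterator[Record]:
        for rec in trace:
            yield {k: v for k, v in rec.items() if k != name}
    return op


def pseudonymize(name: str, salt: str) -> Operator:
    """Hypothetical operator: replace a field with a consistent pseudonym.

    Equal input values map to equal pseudonyms, so records from the same
    source can still be grouped after the value itself is obscured.
    """
    def op(trace: Iterable[Record]) -> Iterator[Record]:
        for rec in trace:
            out = dict(rec)  # copy so the original trace is left untouched
            if name in out:
                digest = hashlib.sha256((salt + out[name]).encode("utf-8"))
                out[name] = digest.hexdigest()[:12]
            yield out
    return op


def compose(*ops: Operator) -> Operator:
    """Chain operators left to right into one composite transformation."""
    def composite(trace: Iterable[Record]) -> Iterator[Record]:
        return iter(reduce(lambda t, op: op(t), ops, trace))
    return composite


# Example: publish a view without payloads and with obscured sources.
trace = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "payload": "GET /a"},
    {"src": "10.0.0.1", "dst": "10.0.0.3", "payload": "GET /b"},
]
safe_view = compose(drop_field("payload"), pseudonymize("src", salt="demo-salt"))
published = list(safe_view(trace))
```

Because each operator maps a trace to a trace, the composite is itself an operator and can be reused or extended with further operators, which is the sense in which the output behaves like a view over the original trace.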
Our work provides a basis for data owners to reconcile their conflicting needs and make accurate decisions about what historical information should be retained and how it may be used effectively. The results from this project were disseminated domestically and internationally, and were incorporated into the curriculum of undergraduate and graduate courses. This project contributed to the cybersecurity workforce by supporting and training undergraduate and graduate research assistants.