Proposal Number: 0831409
Principal Investigator: David Kotz
Institution: Dartmouth College
Proposal Title: CT-ISG: Dartmouth Trace Sanitization Framework

Project Summary

Computer-network research advances more quickly when researchers are able to analyze the activity of live computer networks. Because it is difficult to collect traffic traces from production computer networks, it is critical for the network-research community to share traces. Researchers who capture network traces and wish to share them must, of course, properly "sanitize" those traces to remove sensitive information. Sanitization always involves a challenging trade-off between sanitization effectiveness (providing anonymity for network users and secrecy for network operational information) and research usefulness (since only the information retained can be used by the researcher). This project aims to increase network-trace sharing by making it safer and easier to sanitize network traces. To this end, the project will develop and release NetSANI (Network Trace Sanitization and ANonymization Infrastructure), a flexible and extensible suite of software tools for sanitizing network traces, based on user-specified sanitization goals and user-specified research goals. The tools will be verified on extensive traces collected at Dartmouth College, and evaluated by providing early releases to external collaborators who will test the tools on their own traces. The NetSANI project will have broad academic and practical impact: (a) better tools will enable and encourage more network-trace sharing, which helps the research community do better research, (b) better access to network traces will help companies develop better network products, and (c) better anonymization methods will protect network users' privacy. The outreach efforts will help spur the research community to define norms and best practices for trace sanitization. Finally, the project will involve graduate and undergraduate students in research and incorporate research results into courses.

Project Report

Researchers who develop new network protocols, or new mobile services, often need to test their ideas by conducting simulations. These simulations are best driven by traces of production networks that involve real mobile devices. Obtaining such traces, however, requires enormous effort and infrastructure; thus, the research community should be encouraged to share such traces with other researchers. In the wireless-network research community, there is a site designed to support such sharing: the Community Resource for Archiving Wireless Data At Dartmouth (CRAWDAD.org), originally funded by the NSF CRI program. CRAWDAD is used by over 3,400 researchers in 83 countries around the world, and new datasets are added every month. However, before network traces can be shared through such a service, they must be "sanitized" to remove any personal information (such as the identities of individuals whose network traffic was collected, including quasi-identifiers like mobile device identifiers) or proprietary information (such as the details of an enterprise network where the trace was collected). It turns out to be quite difficult to sanitize a network trace so that the sensitive information is removed and yet enough information is retained for the trace to be useful to researchers. In this project, we developed a software framework -- called NetSANI -- that aims to make it easier for a trace publisher to experiment with different trace-sanitization tools (and configurations of those tools), and to measure both the anonymity and the utility of the resulting trace. We surveyed researchers to learn about their needs and practices, and published the results of that survey. We surveyed the literature on network-trace anonymization, and published an overview. We also discovered new methods for de-anonymizing a trace of user mobility records -- that is, for re-identifying users within a supposedly anonymous trace -- and explored various countermeasures. Finally, we developed a novel tool for collecting traces from production Wi-Fi networks much more efficiently and effectively than earlier tools.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 0831409
Program Officer: Vijayalakshmi Atluri
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2011-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $343,088
Indirect Cost:
Name: Dartmouth College
Department:
Type:
DUNS #:
City: Hanover
State: NH
Country: United States
Zip Code: 03755