Network operators must control and monitor the flow of information within and across their networks. Existing mechanisms for controlling information flow are primarily host-based: an operating system can "taint" regions of memory or entire applications based on the inputs to a particular process or resource. Unfortunately, if a host is compromised, the information it holds may propagate in unintended ways, and once information has leaked, tracing its provenance is difficult. To address these problems, this project is developing a mechanism for tracking and controlling information flow across the network. The mechanism would allow operators to control how information propagates within and between networks and to devise more complex policies; for example, it could be used to control which application traffic is allowed on which part of the network. The information carried in network traffic might ultimately be attributed to a specific user or process, allowing operators to express policies in terms of the user and process that generated the traffic.
We are addressing several research challenges. First, we are exploring the appropriate granularity for tainting, one that preserves semantics without imposing unacceptable memory and performance overhead. Second, we are designing the system to minimize the performance overhead it imposes on applications. Third, we are exploring mechanisms for translating between host-based and network-based taints, so that taints carried in network traffic convey meaningful semantics without imposing prohibitive network overhead. The research will result in an information tracking and control system that is deployed in experimental settings (e.g., the Georgia Tech campus network) using existing and forthcoming programmable switch implementations. The system will also be integrated into undergraduate and graduate networking and security courses.
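
As a rough illustration of the third challenge, the sketch below shows one way host-level taints might be folded into a small fixed-width tag carried in network traffic and then checked against a per-segment policy. It is only a sketch under stated assumptions, not the project's design: the 8-bit tag width, the label names, and the functions `to_network_tag`, `from_network_tag`, and `allowed` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, FrozenSet

# Hypothetical mapping from host-level taint labels to bit positions
# in a compact network tag. Label names and width are assumptions.
TAINT_BITS: Dict[str, int] = {
    "user:alice": 0,
    "proc:browser": 1,
    "data:payroll": 2,
}
TAG_WIDTH = 8  # assumed number of bits available for the tag in a packet


def to_network_tag(host_taints: FrozenSet[str]) -> int:
    """Translate a set of host taints into a bitmask network tag."""
    tag = 0
    for label in host_taints:
        bit = TAINT_BITS[label]  # raises KeyError for an unknown label
        if bit >= TAG_WIDTH:
            raise ValueError(f"label {label!r} does not fit in the tag")
        tag |= 1 << bit
    return tag


def from_network_tag(tag: int) -> FrozenSet[str]:
    """Recover the host taint labels encoded in a network tag."""
    return frozenset(
        label for label, bit in TAINT_BITS.items() if tag & (1 << bit)
    )


def allowed(tag: int, denied_mask: int) -> bool:
    """A segment admits traffic only if no denied taint bit is set."""
    return (tag & denied_mask) == 0


if __name__ == "__main__":
    tag = to_network_tag(frozenset({"user:alice", "data:payroll"}))
    guest_denied = 1 << TAINT_BITS["data:payroll"]  # guest segment bars payroll data
    print(sorted(from_network_tag(tag)), allowed(tag, guest_denied))
```

A bitmask keeps the per-packet cost fixed and makes the policy check a single mask-and-compare, but it limits the number of distinct taints to the tag width. That is one concrete form of the trade-off between semantics and overhead described above.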