Information flow is a central concept in computer security, yet accurately tagging information in a running system and tracking how it flows through that system remains an open problem. We are developing fundamental concepts in control theory, information theory, and systems to solve this problem using what we call a relaxed static stability approach.
In a running system, as information is processed or cut and pasted by users, it flows in unexpected ways. Two major challenges are address dependencies and control dependencies. Overtagging these dependencies causes the entire system to quickly become tagged, while undertagging them means that important flows of information are not tracked. Modern fighter jets and stealth aircraft are designed without inherent stability; advanced digital "fly-by-wire" systems are then incorporated into the design to create a stable system that can actually fly. By applying this same kind of "relaxed static stability" approach, we are designing an accurate dynamic information flow tracking system that makes the right tradeoffs between overtagging and undertagging. This will enable whole new classes of applications based on dynamic information flow tracking, ranging from digital forensics and malware analysis to data provenance. By addressing a fundamental need in security and privacy research, we expect our work to have impact in any field where the flow of information in a computer is important to understand.
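To make the address and control dependencies mentioned above concrete, the following is a minimal illustrative sketch, not the project's implementation. It assumes a one-bit taint tag, and every name in it (Tagged, lookup, copy_bit_by_branch, run) is hypothetical. An address dependency arises when tagged data selects which memory location is read; a control dependency arises when tagged data decides which branch executes. The two boolean flags stand in for the tagging policy: with both off, information derived from the tagged input escapes untagged (undertagging), while with both on, every result touched by the input is tagged, which in a full system is what tends to spread tags everywhere (overtagging).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """A value paired with a one-bit taint tag (hypothetical toy model)."""
    value: int
    tainted: bool = False


def lookup(table, index, track_address_deps):
    """Load table[index.value]; the index influences the result only as an address."""
    entry = table[index.value]
    tainted = entry.tainted or (track_address_deps and index.tainted)
    return Tagged(entry.value, tainted)


def copy_bit_by_branch(secret, track_control_deps):
    """Copy the low bit of `secret` through a branch, with no direct data flow."""
    if secret.value & 1:
        out = 1
    else:
        out = 0
    # The constant assigned above carries a tag only if the policy follows
    # control dependencies.
    return Tagged(out, track_control_deps and secret.tainted)


def run(track_address_deps, track_control_deps):
    """Push one tainted input through both kinds of dependency."""
    table = [Tagged(v) for v in (10, 20, 30, 40)]  # untainted lookup table
    secret = Tagged(3, tainted=True)               # tainted input
    a = lookup(table, secret, track_address_deps)
    b = copy_bit_by_branch(secret, track_control_deps)
    return a, b


if __name__ == "__main__":
    for policy in [(False, False), (True, True)]:
        print(policy, run(*policy))
```

In this toy model, both results are fully determined by the tainted input, yet the policy alone decides whether they carry a tag. Choosing when to propagate such tags is the tradeoff that the relaxed static stability approach is meant to manage.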
Dynamic Information Flow Tracking (DIFT) is a method for understanding how information flows through a system while a program is running. As part of this grant, we explored ways to improve DIFT using control theory, explored new applications of DIFT in cyberphysical systems, and developed a better understanding of information flow in the network protocol stack implementations that operating systems use to connect to the Internet. Outreach to New Mexico middle school and high school students and curriculum development were also important aspects of the project. Here we briefly describe a few examples of our achievements.
This grant produced two MS students who were fully supported by the award and two Ph.D. students who were partially supported by it. Of the two Ph.D. students who graduated, one was from a group traditionally underrepresented in computer science. Both MS students were from groups traditionally underrepresented in computer science: one (Maria Khater) is now in a Ph.D. program at Virginia Tech, and the other (Rafael Figueroa) also plans to pursue a Ph.D. Mohammed Al-Saleh (Ph.D., 2012) is now a tenure-track faculty member in the Computer Science Dept. at the Jordan University of Science and Technology. Roya Ensafi defended her dissertation in November 2014 and will begin a post-doc at Princeton University on January 1st, 2015.
The originally proposed effort was to apply control theory to make DIFT more accurate. Our results from this effort are promising and were presented in Maria Khater's Master's thesis. A new research direction, which also represents a potential new collaboration between the PIs, is the application of DIFT to cyberphysical systems. In Rafael Figueroa's Master's thesis, DIFT is used to accumulate human inputs and their effects over time and to classify them on a continuous range between spurious and legitimate. In this way, a cyberphysical system such as an unmanned vehicle can recover from a malicious input source.
Also as part of this grant, we gained a better understanding of how information flows in network stacks. Network stacks are the protocol implementations that operating systems use to communicate on the Internet. One application of this work is a novel TCP/IP side channel that we developed, called the hybrid idle scan, which makes it possible to determine whether (almost) any two IP addresses in the world are able to communicate with each other, or whether some firewall in between (e.g., for censorship reasons) is preventing them from sending packets to each other. We are working with U.C. Berkeley's International Computer Science Institute (ICSI), the Tor Project, the University of Toronto Citizen Lab, and other research groups both to carry out world-wide measurement of Internet censorship over time and to focus on specific countries to understand how they block censorship circumvention tools such as Tor. In another application of TCP/IP side channels, we showed that it is possible to infer how many packets a remote Linux server is sending to any other IP address in the world. This led to a Linux kernel patch to protect people from, e.g., state actors who may attempt to de-anonymize communications in this way.
We have also carried out curriculum development and local outreach efforts. We developed a game called Werewolves that can be used to teach students about the Linux command line, basic cybersecurity concepts, and (for more advanced students) information flow and side channels. In the summer of 2014 we hosted the first UNM Cybersecurity Bootcamp, in collaboration with UNM's Anderson School of Management. We hosted about 35 high school students, approximately half of whom were from groups traditionally underrepresented in computer science. We were also involved with the New Mexico Supercomputing Challenge (a state-wide competition that reaches hundreds of high school and middle school students every year) and taught a Winterim course on computer programming at Bosque Middle School in Albuquerque, New Mexico.