Cloud computing provides economic advantages from shared resources, but security is a major risk for remote operations and a major barrier to the approach, with challenges for both hosts and the network. NEBULA is a potential future Internet architecture providing trustworthy networking for the emerging cloud computing model of always-available network services. NEBULA addresses many network security issues, including data availability with a new core architecture (NCore) based on redundant connections to and between NEBULA core routers, accountability and trust with a new policy-driven data plane (NDP), and extensibility with a new control plane (NVENT) that supports network virtualization, enabling results from other future Internet architectures to be incorporated in NEBULA. NEBULA's data plane uses cryptographic tokens as demonstrable proofs that a path was both authorized and followed. The NEBULA control plane provides one or more authorized paths to NEBULA edge nodes; multiple paths provide reliability and load-balancing. The NEBULA core uses redundant high-speed paths between data centers and core routers, as well as fault-tolerant router software, for always-on core networking. The NEBULA architecture removes network (in)security as a prohibitive factor that would otherwise prevent the realization of many cloud computing applications, such as electronic health records and data from medical sensors. NEBULA will produce a working system that is deployable on core routers and is viable from both an economic and a regulatory perspective.

Project Report

Cornell's efforts within the NSF FIA-sponsored NEBULA project focused on three topics. Our first activity solved a long-open problem in network routing. Previous research had shown that network disruptions often stem from transient instabilities within network routers that result in slow delivery of data to the user. The user experiences choppy audio or video and slow page downloads, and such effects could become dangerous once control systems for the smart power grid or for smart cars depend on network access to databases and other services. Companies such as Cisco (with which we partnered) have made their hardware routers more reliable, but the software infrastructure has lagged. For example, Cisco showed us cases of routing instability that lasted two or three minutes, triggered by restarting a router daemon, an event that would normally take less than a second. In these episodes, the hardware and the routing daemon were quickly back to normal, yet the Internet would "ring like a bell": the brief disruption to the routing protocols employed at the Internet’s lowest levels propagated widely, dying out only over an extended period of time. Clearly, for the future Internet to play its intended roles, this kind of issue needs to be resolved.
Within FIA, Cornell (Ken Birman, Robbert Van Renesse and PhD student Robert Surton) teamed with other groups to help Cisco demonstrate a solution to this problem. The approach was to make small changes to the software of the standard router daemon program, yielding a fault-tolerant version that stores all of its routing information in a special kind of data structure called a distributed hash table (DHT). A DHT remains available even if some router components fail.
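To make the DHT idea concrete, the following is a minimal sketch, not the code used with Cisco's router daemon. It assumes a toy in-process hash ring in which every key is copied to two nodes; `TinyDHT`, `RoutingDaemon`, `INDEX_KEY` and the node names are hypothetical names invented for this illustration. It shows a daemon that writes each route to the replicated table and rebuilds its in-memory state from the table when it restarts.

```python
import hashlib
from bisect import bisect_right

INDEX_KEY = "__prefix_index__"  # hypothetical key listing all stored prefixes


class TinyDHT:
    """Toy replicated hash table: each key is stored on `replicas` nodes."""

    def __init__(self, node_names, replicas=2):
        # Assumes replicas <= number of nodes, so owners are distinct.
        self.replicas = replicas
        # Place the nodes on a hash ring, ordered by hashed name.
        self.ring = sorted((self._hash(n), n) for n in node_names)
        self.stores = {n: {} for n in node_names}
        self.down = set()

    @staticmethod
    def _hash(text):
        return int(hashlib.sha256(text.encode()).hexdigest(), 16)

    def _owners(self, key):
        # Walk clockwise from the key's ring position; take the next nodes.
        start = bisect_right(self.ring, (self._hash(key), ""))
        size = len(self.ring)
        return [self.ring[(start + i) % size][1] for i in range(self.replicas)]

    def put(self, key, value):
        for node in self._owners(key):
            if node not in self.down:
                self.stores[node][key] = value

    def get(self, key):
        # Any live replica that holds the key can answer.
        for node in self._owners(key):
            if node not in self.down and key in self.stores[node]:
                return self.stores[node][key]
        return None

    def fail(self, node):
        self.down.add(node)


class RoutingDaemon:
    """Keeps routes in memory but treats the DHT as the durable copy."""

    def __init__(self, dht):
        self.dht = dht
        # On (re)start, rebuild the in-memory table from the DHT.
        prefixes = dht.get(INDEX_KEY) or []
        self.routes = {p: dht.get(p) for p in prefixes}

    def learn(self, prefix, next_hop):
        self.routes[prefix] = next_hop
        self.dht.put(prefix, next_hop)
        self.dht.put(INDEX_KEY, sorted(self.routes))


if __name__ == "__main__":
    dht = TinyDHT(["rp0", "rp1", "rp2"])
    RoutingDaemon(dht).learn("10.0.0.0/8", "192.0.2.1")
    dht.fail("rp0")                   # one storage node is lost
    print(RoutingDaemon(dht).routes)  # a restarted daemon reloads its routes
```

With two replicas per key, any single node failure leaves at least one live copy of every route, which is why the restarted daemon can recover its full table in this sketch.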
Our approach required minimal changes to the router daemon code: the data structures in which it stored routing information were modified to store data in the DHT and to reload data from the DHT on restart, and the daemon was otherwise unmodified. Cornell’s specific contribution was to show how to completely mask daemon restart events from the perspective of remote routers. We did this by creating a new technology for unbreakable TCP network links. The work could also be useful in building other forms of highly available services, such as cloud computing services that never crash, which are exactly the kinds of solutions that will be needed for smart cars, the smart power grid, and other continuously operational applications. We published this work in two venues: an article in the IEEE Network Computing magazine discussed the overall project, while a more technical paper at the 2013 IEEE Dependable Systems and Networks conference focused on our TCP-R protocol.
Our second big effort focused on the stability and performance of network traffic at very high data rates: 10 Gb/s and beyond. In work led by Cornell’s Hakim Weatherspoon, we determined that network disruptions can sometimes be traced to unsteady data rates that arise when high-speed routers or network interfaces turn a steady data flow (of the kind needed for voice or video) into a bursty one by batching messages and then releasing the batch all at once. Weatherspoon’s group created a new kind of hardware solution, called SONIC (the Software-Defined Network Interface Card), that can measure or control inter-packet timing at very high resolution: accurate to tens of picoseconds.
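The effect of batching on a steady flow can be illustrated with a few lines of arithmetic. This is a simplified model, not SONIC itself: it assumes a hypothetical stage that holds packets until a full batch has arrived and then sends them back to back. The function names, the 1000 ns input spacing, and the 100 ns wire gap are all invented for the example, and times are integer nanoseconds.

```python
def batch_release(arrivals, batch_size, wire_gap):
    """Hold packets until `batch_size` have arrived, then send back to back.

    `arrivals` is a sorted list of arrival times; `wire_gap` is the spacing
    between packets inside a released batch. Returns departure times.
    """
    departures = []
    for i in range(0, len(arrivals), batch_size):
        batch = arrivals[i:i + batch_size]
        release = batch[-1]  # the batch leaves once its last packet arrives
        departures.extend(release + k * wire_gap for k in range(len(batch)))
    return departures


def gaps(times):
    """Inter-packet gaps: the difference between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    steady = [i * 1000 for i in range(8)]  # one packet every 1000 ns
    bursty = batch_release(steady, batch_size=4, wire_gap=100)
    print("input gaps: ", gaps(steady))
    print("output gaps:", gaps(bursty))
```

In this model the input gaps are all identical, while the output alternates between short gaps inside a batch and one long gap between batches. That is the kind of pattern that only shows up when inter-packet timing is measured at fine resolution.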
SONIC makes it possible to pinpoint the causes of data loss in high-speed networks, to study the way that precisely spaced trains of packets are perturbed as they flow through the network, and even to detect (or create) covert channels that pass information by encoding it in the number of network idle characters interposed between legitimate packets. With standard network cards this would not have been feasible, because the granularity of the needed timing measurements is much too fine to sense even with the most elaborate commercial network interface solutions. A number of papers on the work have appeared, including in the highly regarded NSDI and SIGCOMM-IMC conferences. Weatherspoon’s group is now deploying SONIC as part of a nationwide testbed aimed at making this form of ultra-precise measurement and instrumentation widely available within the public Internet and on the NSF GENI testbed system.
A third effort began late in the FIA program and was headed by Zygmunt Haas, who studies the use of homomorphic computing for secure mobile cloud computing systems. Haas proposed a new method for delegating the computations of resource-constrained mobile clients, in which multiple servers interact to construct an encrypted program known as a "garbled circuit." The proposed system assures the privacy of the mobile client's data even if the executing server colludes with all but one of the other servers. The scheme has been evaluated, demonstrating its feasibility for secure and verifiable cloud computing for mobile systems.
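Returning to the covert channel mentioned in the SONIC discussion, the idea of hiding data in the number of idle characters between packets can be sketched as follows. This is an illustration only and does not reproduce the encoding studied in the SONIC papers: the baseline of 1000 idle characters, the offset of 8, and all function names are assumptions chosen for the example.

```python
def text_to_bits(text):
    """Expand ASCII text into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("ascii")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Inverse of text_to_bits; expects a multiple of 8 bits."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int("".join(map(str, c)), 2) for c in chunks).decode("ascii")


def encode_bits(bits, base_idles=1000, delta=8):
    """Sender: one inter-packet gap per bit; a 1 adds `delta` idle characters."""
    return [base_idles + delta if bit else base_idles for bit in bits]


def decode_gaps(idle_counts, base_idles=1000, delta=8):
    """Receiver: compare each observed gap with the midpoint threshold."""
    threshold = base_idles + delta / 2
    return [1 if count > threshold else 0 for count in idle_counts]


if __name__ == "__main__":
    sent = encode_bits(text_to_bits("hi"))
    # Model a little network jitter: each gap is off by two idle characters.
    seen = [g + (2 if i % 2 else -2) for i, g in enumerate(sent)]
    print(bits_to_text(decode_gaps(seen)))
```

The receiver can only recover the message if it can count idle characters more precisely than the network perturbs them, which is why this kind of channel needs measurement at the resolution described above.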

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 1040689
Program Officer: Darleen L. Fisher
Project Start:
Project End:
Budget Start: 2010-09-01
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2010
Total Cost: $1,005,199
Indirect Cost:
Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850