While we are increasingly dependent on networked, distributed systems, current distributed systems require extensive manual effort to deploy, are expensive to administer, and nevertheless tend to be fragile. These problems stem fundamentally from the fact that current distributed systems lack principled design and depend on unreliable heuristics for their operation.

We are investigating new ways to build robust, high-performance infrastructure services, based on novel analytical and numerical techniques for optimal or near-optimal resource allocation in large-scale distributed systems. These techniques enable a new generation of highly reliable, self-organizing infrastructure services. In particular, we are developing a next-generation name service that replaces the Domain Name System (DNS) while seamlessly supporting the existing name hierarchy, a web cache that achieves predictably high performance by proactively delivering hot data items close to clients, and an eternal software repository that minimizes download time for swarming, BitTorrent-style downloads.

This work lays a foundation for a principled understanding of resource tradeoffs in distributed systems, which are currently made either manually or in an ad hoc fashion. The new techniques are directly applicable to self-organizing peer-to-peer systems and lead to performance, manageability, and availability properties that are difficult to achieve via conventional means. Applying these techniques to the implementation of infrastructure services will yield more robust replacements for the fragile and expensive services we use today, such as DNS, web caches, and software distribution.

Project Report

This grant focused on building self-managing, self-administering, and self-optimizing large-scale systems. Such systems span both peer-to-peer systems and Internet-scale applications. The activities undertaken through this grant can be divided into three thrusts: self-organizing P2P systems (the CoDoNS, CobWeb, and Corona projects), localization systems (the Meridian, Cubit, and Octant projects), and content distribution systems (the Antfarm and V-Formation projects). Overall, the outcomes from these efforts include the following:

My group developed a new class of distributed systems that have the potential to provide much higher performance than state-of-the-art systems, while simultaneously achieving higher robustness to churn, changes in workload, and network conditions. The key insight behind this work is the use of mathematical optimization techniques for key resource allocation decisions in distributed systems. This principled approach can achieve much better performance than parameters selected statically by humans or heuristically by computers. My work provides the theoretical and practical foundation for this approach.

My group demonstrated the practicality of this approach by building three applications: a safety net for the Domain Name System, a high-performance web proxy, and a publish-subscribe system. These services were built, deployed, and operated to provide free services to users on the Internet. We collected the first empirical trace for a realistic, large-scale publish-subscribe system. This trace was used by various researchers to evaluate their publish-subscribe systems.

My group developed techniques for efficiently locating nearby nodes on the Internet. This is a basic building-block service used in web services, content distribution systems, and large-scale networked gaming systems. We also developed techniques for determining the real-world location of a computer based on network measurements. This is a challenging task, as network measurements tend to be noisy. The framework we developed for this purpose is precise, accurate, and general enough to subsume most past work on geolocalization.

My group has developed new systems for optimally distributing content. Content distribution is a significant problem, accounting for around 60% of flows and 80-90% of all bytes transferred on the Internet. Our techniques make this process more efficient, especially for content distributors in possession of large content libraries.

Overall, this grant enabled my group to broadly examine principled techniques for the construction of robust Internet-scale services. This effort led to new approaches for solving critical problems. The approaches we have developed are practical, as demonstrated through their application to common real-world problems. My group has a track record of building real working artifacts to demonstrate the utility of our proposed new approaches. In this instance, these artifacts led to systems used by millions of anonymous Internet users. The intellectual developments either form the current state of the art in their respective areas or have been built upon by other researchers who have extended the ideas therein.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 0546568
Program Officer: Krishna Kant
Project Start:
Project End:
Budget Start: 2006-02-01
Budget End: 2011-01-31
Support Year:
Fiscal Year: 2005
Total Cost: $400,000
Indirect Cost:
Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850