The past few years have witnessed a growing number of large-scale networked systems. Most of these systems are built following an overlay approach, with each of them regularly and independently probing its environment to guide path selection algorithms, route around faulty links and replicate content for availability. As these systems grow in popularity, such an approach will result in an unsustainable degree of monitoring and restrict the variety, number and span of distributed services.
The thesis of this project is that a large fraction of globally distributed systems can be built to ensure sustainable scalability by strategically reusing the view of the network gathered by long-running, ubiquitous services such as CDNs and P2P systems. This work defines and explores "3R", a new approach to the design and implementation of distributed systems that focuses on minimizing aggregated control and administrative overhead by strategically reusing existing views of the environment and recycling previously gathered measurements. In particular, we are designing efficient techniques for maintaining, accessing and reusing this information for building next-generation streaming multicast, content distribution and data sharing applications.
The work explores the tradeoffs in the recycling and reuse of environmental measurements, and their implications for distributed-system design. Ultimately, it will facilitate deployments of wide-area applications, benefiting society at large. The research agenda is complemented by a thorough education and outreach plan for strengthening experimental systems education and contributing to minority recruitment and retention.
Over the past few years we have witnessed a growing number of Internet-scale distributed systems. Most of these systems are built following an overlay network approach, with each of them regularly and independently probing its environment to guide path selection algorithms, route around faulty links and replicate content for availability. As these systems grow in popularity, this "measure-yourself" approach will result in an unsustainable degree of monitoring and restrict the variety, number and span of distributed services. The thesis of this project is that a large fraction of globally distributed systems can be built to ensure sustainable scalability by strategically reusing the view of the network gathered by long-running, ubiquitous services such as Content Distribution Networks (CDNs) and Peer-to-Peer (P2P) systems. This work defined and explored "3R", a new approach to the design and implementation of distributed systems that focuses on minimizing aggregated control and administrative overhead by strategically reusing existing systems’ views of a shared environment and recycling previously gathered measurements. As part of this effort, we have designed efficient techniques for maintaining, accessing and reusing this information for building systems such as next-generation streaming multicast, content distribution and data sharing applications. We have explored and experimentally evaluated approaches that reuse CDNs’ views of the Internet to drive a network positioning system and a network detouring service. We have built and made publicly available a library for reusing this information to build network positioning and detouring services. We have evaluated the reuse of P2P services to drive a service-level network event detection system, characterize access network services and content distribution services, and help map the constantly evolving Internet topology.
We have built and made publicly available a BitTorrent software extension that reuses CDN network views to bias P2P connections, aiming to reduce the network impact of P2P traffic while improving service performance for end users. We have built and deployed SideStep and DraFTP. The SideStep library and the DraFTP service, available on PlanetLab, show the effectiveness of measurement reuse for performance-based detouring and demonstrate its practicality by helping users take advantage of these alternative routes between end hosts. We have built and made available another BitTorrent extension that cooperatively detects potential network anomalies by looking for corroboration among other users in the same networks. We have designed and evaluated a BitTorrent extension to improve users’ privacy by disrupting potentially identifiable cliques in data sharing communities. We have also started to explore approaches to characterize the broadband services received by end users, arguing that current approaches exhibit apparently unavoidable tradeoffs between coverage, continuous monitoring and capturing user-perceived performance, and that network-intensive applications running on end systems may help avoid these tradeoffs.
Over the last five years, this project has also supported six graduate students and, through the NSF Research Experiences for Undergraduates (REU) program, three undergraduate students. In addition, at least three other graduate students and several undergraduate students have contributed to and benefited from this effort. Many of the students involved have received awards and recognition for work done on this project, including a departmental Doctoral Dissertation Award, a nomination for the Association for Computing Machinery (ACM) Doctoral Dissertation Award, a Computing Research Association (CRA) Postdoctoral Fellowship and an Illinois’ 50 for the future award. Four of the undergraduate students have joined Ph.D. programs at the PI’s institution and elsewhere.
In education, we have enhanced the content of several courses and developed a new research course on "Distributed Systems in Challenging Environments", aimed at both senior undergraduate and graduate students, that has attracted a significant number of students, many of them undergraduates, from different areas in and outside our department. The course is scheduled for a third iteration in Winter 2013.