Sound Internet research requires comprehensive network measurements. This work is motivated by the observation that, collectively, peers in large-scale peer-to-peer (P2P) systems have a unique and valuable perspective on network conditions, one to which today's researchers, network operators and users have limited or no access.
For over a year the principal investigator (PI) has made available a BitTorrent extension aimed at reducing the known problem of cross-ISP traffic in P2P systems. This extension reduces off-network traffic through biased peer selection, using a lightweight approach that reuses network information from commercial CDNs, and it has been shown to be highly effective in the wild. In addition to biasing peer selection, the extension allows subscribing volunteers to participate in a monitoring service for the Internet. With over 260,000 volunteer users today (growing at a current rate of 200 per day), distributed across over 150 countries and more than 3,000 networks, this system is already the largest-ever monitoring service. On a daily basis, this growing collection of users allows the researcher to record over 100,000,000 transfer-rate samples and 1.8 million traceroutes to over 3.5 million peers spread across 18,000 ASes.
While this dataset could clearly answer a large number of important research questions, collecting, archiving and analyzing this volume of data is a constant challenge. In a single day, we must handle over 7 GB of uncompressed data! To address this challenge, the PI will build the P2P Monitoring System, a platform for the collection, archival and off-line analysis of the network- and application-level traces passively collected by the participating hosts. The end goals are (a) to reduce the effort currently dedicated to the logistics of managing this dataset, in order to focus on exploring the many research paths it opens, and (b) to make possible the sharing and long-term archival of this valuable dataset with the wider community through well-established organizations such as CAIDA.
Intellectual Merit: The framework will enable a wide range of research activities, including: (i) exploring a new approach to the design and implementation of large-scale distributed systems based on the reuse of previously gathered measurements, (ii) understanding the feasibility of a network early warning system that relies exclusively on information passively gathered from P2P systems, (iii) evaluating the evolution and stability of the social network formed by P2P users, (iv) exploring the potential for revealing a more complete Internet connectivity graph through the contribution of hundreds of thousands of vantage points worldwide and (v) investigating approaches to diagnose and enhance the resilience of networks to natural disasters.
Broader Impact: Access to end-host views of the network is key to understanding and characterizing the underlying network and to addressing the needs of newly emerging Internet services and applications. The framework developed under this project will free researchers from logistical concerns related to the collection and processing of these end-host views, allowing them instead to focus on exploring some of the promising research paths they open. The availability of this framework and dataset will also help to significantly enhance education at Northwestern through student training and curriculum enhancement. The availability of high-quality measurements is often key to the success of research efforts. To open the field of sound, measurement-based research to the broader research community, we will continue an ongoing conversation with CAIDA exploring approaches to facilitate the sharing of this dataset and ensure its long-term archival.