This research addresses the trustworthy distribution and retrieval of information across the Internet. Specifically, it concerns the distribution of metadata and requests, the matching of requests against metadata, and the retrieval of the information that the metadata describes. On the Internet today, this functionality is provided by centralized search engines and search indexes. However, such centralized mechanisms can easily be subverted to prevent the distribution and retrieval of information.

The iTrust distribution and retrieval network aims to ensure that individuals cannot be prevented from distributing or retrieving ideas and information across the Internet. In iTrust, source nodes produce metadata that describes their information, and distribute that metadata to nodes chosen at random. Similarly, requesting nodes distribute their requests to nodes chosen at random. Nodes compare the requests with the metadata they hold. If a node finds a match, it returns the URL of the associated information to the requesting node, which then retrieves the information from the source node.
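
The exchange described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the project's software: the names `Node`, `distribute_metadata` and `search`, the example URL, and the parameters `m` and `r` (the numbers of nodes that receive the metadata and the request) are hypothetical, and network messages are replaced by direct method calls.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """A participating node (hypothetical model; all nodes are equal)."""
    node_id: int
    # Maps a metadata keyword to the URLs of the information it describes.
    metadata: dict = field(default_factory=dict)

    def store(self, keywords, url):
        """Hold metadata (keywords plus source URL) received from a source node."""
        for kw in keywords:
            self.metadata.setdefault(kw, set()).add(url)

    def match(self, keywords):
        """Return the URLs whose metadata shares a keyword with the request."""
        urls = set()
        for kw in keywords:
            urls |= self.metadata.get(kw, set())
        return urls


def distribute_metadata(nodes, keywords, url, m, rng):
    """Source node: send the metadata to m nodes chosen at random."""
    for node in rng.sample(nodes, m):
        node.store(keywords, url)


def search(nodes, keywords, r, rng):
    """Requesting node: send the request to r nodes chosen at random and
    collect the URLs returned by every node that finds a match."""
    urls = set()
    for node in rng.sample(nodes, r):
        urls |= node.match(keywords)
    return urls


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    nodes = [Node(i) for i in range(100)]
    distribute_metadata(nodes, {"itrust", "search"},
                        "http://source.example/doc", m=20, rng=rng)
    # The result is empty if none of the r contacted nodes holds the metadata.
    print(search(nodes, {"itrust"}, r=20, rng=rng))
```

In this sketch the requesting node would then fetch the information from the returned URL. Whether a given request finds a match depends on whether the two random subsets overlap, which is why the numbers of nodes contacted matter.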

The iTrust distribution and retrieval network is the first to provide an effective, fully distributed Internet search that is difficult to subvert. This project will develop infrastructure software for iTrust and a user interface that is convenient and easy to use. The technology and source code for iTrust will be freely available on a public Web site. The expected impact and significance of iTrust are that, even if the conventional centralized Internet search mechanisms are subverted, an alternative will exist to protect the free flow of information across the Internet.

Project Report

The Problem and the Solution

The free flow of information is one of the basic tenets of liberty. Today, we rely on the Internet to access information. Currently, our trust in the accessibility of information over the Internet and the Web depends on benign and unbiased administration of centralized search engines. Unfortunately, centralized systems rely on one or a few nodes that can be easily subverted to censor information. In this NSF research project, we have developed a distributed and decentralized publication, search and retrieval system, named iTrust.

Our initial implementation of iTrust, based on the Hypertext Transfer Protocol (HTTP), is most appropriate for desktop or laptop computers on the Internet. However, today, many people use mobile phones to organize their activities and, in many countries, mobile phones are the only computing platform generally available. Consequently, we have developed a version of iTrust for mobile phones using the Short Message Service (SMS). To guard against the risk that both the Internet and the cellular telephony infrastructure are disabled, we have developed a Wi-Fi Direct version of iTrust for mobile ad-hoc networks. Mobile phones or tablets enabled with Wi-Fi Direct can share information directly, without the need for wireless access points or cellular network connections. Thus, with iTrust, users can decide which kind of network is most appropriate for their current needs, and they can share information without reliance on, or interference from, a third party. Figure 1 illustrates the different kinds of networks over which iTrust operates.

The Basic Idea of iTrust

The iTrust network is an unstructured network in which all nodes are equal. Some of the nodes, the source nodes, produce information and make that information available to other participating nodes. The source nodes also produce metadata that describes their information. A source node distributes the metadata, together with the address of the information, to a subset of the participating nodes chosen at random.

Other nodes, the requesting nodes, request and retrieve information. A requesting node generates a request that contains keywords and distributes that request to a subset of the other participating nodes chosen at random. Nodes that receive a request compare the keywords in the request with the metadata they hold. If a node finds a match, which we call an encounter, the matching node returns the address of the associated information to the requesting node. The requesting node then uses the address to retrieve the information from the source node. Figure 2 illustrates the basic concept of iTrust.

The Intellectual Merit

The intellectual merit of the iTrust publication, search and retrieval system lies, in part, in its distributed and decentralized nature, and its objective to support the sharing of information without fear of censorship. It also lies in its probabilistic nature and its distribution of both the metadata and the requests (queries) to randomly chosen nodes in the network.

The intellectual merit of iTrust also lies in its statistical algorithms for detecting and defending against malicious nodes in the iTrust network. Using the chi-squared statistic and the exponentially weighted moving average, the novel statistical detection algorithm estimates the proportion of subverted or non-operational nodes in the iTrust network. The novel defensive adaptation algorithm then increases the number of nodes to which the requests are distributed, to maintain the same probability of a match as when all of the nodes are operational.

The intellectual merit of iTrust also lies in its adaptive membership protocol, which exploits the messages that are already being sent to distribute metadata or requests, thus reducing the need for additional messages. The novel adaptive membership protocol estimates the membership churn, i.e., the nodes joining or leaving the membership, based on the responses that a node receives to its requests. The node then dynamically adjusts the number of nodes to which it sends metadata and requests to compensate for the changing membership.

The Broader Impacts

The most significant broader impact of iTrust is societal. The free flow of information is the primary determinant of a free and democratic society. The free flow of information discourages small groups of people from trying to abuse the government, the economy or the environment for their own personal gain. Many research projects provide benefits to individuals, but it is rare for a research project to provide benefits to society as a whole.

More specifically, the benefits of iTrust include the ability to create mobile ad-hoc networks using Wi-Fi Direct, which can be of substantial benefit in less developed countries of the world. Distributed membership algorithms have benefits for many modern networked computer systems. The ability to detect malicious attacks indirectly by statistical inference has substantial benefits for many distributed systems.

Additional information on iTrust, including publications, presentations, source code and documentation, can be found on the iTrust Web site http://iTrust.ece.ucsb.edu
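
As a rough illustration of the defensive adaptation described under The Intellectual Merit, the following Python sketch computes the probability of a match and the number of nodes a request would have to reach to keep that probability when some nodes are non-operational. It rests on assumptions that are not taken from this report: matches are modeled as sampling without replacement (a hypergeometric model), the operational fraction is passed in directly rather than estimated with the chi-squared statistic and moving average, and the names `match_probability` and `adapted_request_count` and the parameters `n`, `m`, `r` are hypothetical.

```python
from math import comb


def match_probability(n, m, r):
    """Probability that at least one of r randomly chosen nodes is among the
    m nodes holding the metadata, out of n nodes (assumes 0 <= m, r <= n)."""
    # comb(n - m, r) counts the ways to pick r nodes that all miss the metadata.
    return 1.0 - comb(n - m, r) / comb(n, r)


def adapted_request_count(n, m, r, operational_fraction):
    """Smallest request count that restores the match probability obtained
    with r requests when every node is operational.

    Assumption: only floor(m * operational_fraction) of the metadata holders
    still respond, and the requester cannot tell which nodes are down, so it
    keeps sampling from all n nodes.
    """
    target = match_probability(n, m, r)
    m_live = int(m * operational_fraction)
    if m_live == 0:
        raise ValueError("no operational node holds the metadata")
    for r_new in range(r, n + 1):
        if match_probability(n, m_live, r_new) >= target:
            return r_new
    return n  # not reached: contacting all n nodes always finds a live holder


if __name__ == "__main__":
    n, m, r = 1000, 60, 60
    print(match_probability(n, m, r))
    print(adapted_request_count(n, m, r, operational_fraction=0.5))
```

Under this model, a requester that believes half of the nodes are non-operational has to contact more nodes to keep its chance of an encounter unchanged, which is the effect the defensive adaptation aims for.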

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 1016193
Program Officer: Angelos Keromytis
Budget Start: 2010-08-01
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $531,732
Name: University of California Santa Barbara
City: Santa Barbara
State: CA
Country: United States
Zip Code: 93106