This project develops the Teraflow Testbed (TFT), an instrument designed to test new network protocols and data services for long-haul, high-performance networks. The TFT integrates distributed clusters of workstations at four locations (Amsterdam, Geneva, Chicago-StarLight, and Chicago-UIC) using advanced 10 Gb/s photonic networks, relying on both layer 2 optical switching and layer 3 routers. The project aims to support the development of key network technologies important for the next step in high-performance, data-intensive computing. Research ranges from low-level network protocols to high-level data services. The TFT will consist of distributed nodes on two continents that can transmit, process, and mine very high volume data flows (teraflows). The testbed will enable the development of new network protocols and innovative data integration and data mining services for teraflows. The work involves the design of a new class of applications that move not only the queries and computation but also the data when required. The protocols and services will subsequently be tested on traditional routed networks as well as lambda grids. Three specific research activities will be conducted in the general area of lambda grids, which posit collections of plentiful computing and storage resources richly interconnected by dedicated dense wavelength division multiplexing (DWDM) optical paths: High Performance Network Transport Protocols for Teraflows, High End-to-End Performance for Teraflows, and High Performance Data Services for Teraflows.
The first activity supports the development of new network protocols that provide higher bandwidth utilization and good transport performance over optical network links with high bandwidth-delay products. Building on previous work on SABUL, a rate-based reliable transport protocol for high bandwidth-delay product networks that uses UDP for data and TCP for control flow, a new protocol, UDT, is proposed to achieve high effective throughput while still providing fairness among competing teraflows. The second activity supports the integration of local input-output systems with the very high data rate network protocols. An experimental system of intelligent high-speed disks connected to a cluster will be integrated with the TFT, and methods for relaying control information directly from the parallel I/O system to the rate control algorithm in the new protocol will be investigated to maximize overall performance between remote parallel disks. The third activity develops high-performance data services for mining teraflows that use a SOAP/XML-based control channel and a separate data channel. This activity builds a distributed peer-to-peer storage system for the data that teraflows can easily access, and identifies data mining primitives to filter and process the teraflow.
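To make the bandwidth-delay product concrete, the following minimal sketch computes how much data must be in flight to keep a link full. The function name and the 100 ms round-trip time are illustrative assumptions, not figures from the project; only the 10 Gb/s link rate comes from the description above.

```python
def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Return the bytes that must be in flight to keep a link fully utilized.

    bandwidth_bps: link capacity in bits per second.
    rtt_ms: round-trip time in milliseconds.
    """
    bits_in_flight = bandwidth_bps * rtt_ms / 1000  # bits sent during one RTT
    return bits_in_flight / 8                       # convert bits to bytes


# 10 Gb/s is the link rate named in the text; the 100 ms RTT is an
# assumed, hypothetical value for a long-haul intercontinental path.
bdp = bandwidth_delay_product_bytes(10_000_000_000, 100)
print(f"Bandwidth-delay product: {bdp / 1e6:.0f} MB")
```

Under these assumed values, a sender must keep on the order of a hundred megabytes unacknowledged to fill the link, which suggests why transport protocols designed for such paths are a research topic in their own right.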
Broader Impacts: The TFT is expected to impact homeland defense, business continuity, and disaster recovery technologies. Postdoctoral researchers, graduate students, and undergraduate students will participate in the research. Tutorials on high-performance data transport will be given at technical conferences.