This project will transition the commercial system DataTurbine into an open source software system, while specifically enhancing the middleware across a number of areas. DataTurbine supports reliable data transport and the integration of sensors and remote instruments into cyberinfrastructure (CI) through a set of services supporting data streaming and management. Work areas include visualization, single sign-on, adding datalogger drivers, benchmarking, porting to 64-bit architectures, and other enhancements. Intellectual merit lies in bringing sensors into the CI continuum and in the insights gained from applying this technology to observing systems. Broad impact includes direct operational integration for a number of NSF observing systems, including NEON, NEES, and CUAHSI.
Environmental science and engineering communities are now actively engaged in the early planning and development phases of the next generation of large-scale sensor-based observing systems. These systems face two significant challenges: heterogeneity of instrumentation and complexity of data stream processing. Environmental observing systems are complex distributed systems. They incorporate instruments from across the spectrum of complexity, from temperature sensors to acoustic Doppler current profilers, streaming video cameras, and synthetic aperture radar. They operate under a variety of networking conditions, including wired and wireless, persistent and intermittent. They have stringent requirements on data timeliness and integrity. Managing these instruments and their data streams presents serious challenges in systems development and operations.
The Open Source DataTurbine (OSDT) Initiative was launched in October 2007 with a two-year grant from the National Science Foundation Office of Cyberinfrastructure to address these challenges through the publication, enhancement, and promotion of the DataTurbine streaming data middleware [Tilak07]. More information about the OSDT Initiative is available at http://dataturbine.org/. The NSF award funded the core activities needed to build an open-source software community around the DataTurbine middleware. There were three areas of funded activity: (1) publish DataTurbine as an open source software product and provide developer support, including documentation, bug tracking, collaboration tools, and experimental facilities; (2) enhance the code base, including porting DataTurbine to additional compute platforms, writing additional device drivers, and testing and tuning; and (3) build an active open source community through education, outreach, recruitment, and technical support. The OSDT Initiative has been successful in these activities.
A core component of our project has been professional code management of the Open Source DataTurbine software. This includes code hosting, a download server, a bug tracking system, a developers' discussion forum, a system management blog, and a project web site. These are essential elements of a successful open source software project and represent the foundation for the active community of DataTurbine developers and users. We released DataTurbine as an open-source product (Apache 2.0 license) on the Google Code site: http://dataturbine.googlecode.com/. The OSDT Initiative has 50 registered members, 893 archived messages on the mailing list, 152 downloads of the OSDT source code, and 2100 downloads of various binary versions.
Through code development and community support, the Initiative serves as a catalyst and incubator for numerous science and engineering groups. Since its inception, the OSDT Initiative has demonstrated broad impact on a variety of projects and communities, across a wide range of applications: from lakes and coral reefs, to civil infrastructure and smart buildings, to airborne science and aeronautics [Fountain09, OSDT-Workshop]. The result is an international community of scientists and engineers who share a common interest in real-time streaming data middleware and applications and are collaborating to produce useful middleware and successful deployments (www.dataturbine.org). Community members are drawn from academia and industry, and represent a variety of science and engineering domains, including The Global Lake Ecological Observatory Network (GLEON: www.gleon.org/), The Coral Reef Environmental Observatory Network (CREON: www.coralreefeon.org/), MoveBank: Integrated Database for Network Organism Tracking (MoveBank: www.movebank.org/), The Consortium of Universities for the Advancement of Hydrologic Science, Inc. 
(CUAHSI: www.cuahsi.org/), The Long Term Ecological Research Network (LTER: www.lternet.edu/), The Network for Earthquake Engineering and Simulation (NEES: www.nees.org), The National Center for Ecological Analysis and Synthesis (NCEAS: www.nceas.ucsb.edu/), Great Barrier Reef Ocean Observing System (GBROOS: http://imos.org.au/gbroos.html), and The Pacific Rim Application and Grid Middleware Assembly (PRAGMA: www.pragma-grid.net/).
The activities carried out under this project have significant intellectual merit for both the cyberinfrastructure research community and a variety of application domains. The insights and design principles developed under this project advanced the state of the art in systems engineering for large-scale distributed systems, including the planned national and international environmental observing systems. The lessons learned through the application of Open Source DataTurbine to the targeted application domains will provide guidance to domain scientists on employing cyberinfrastructure to address their specific scientific missions. The resulting open-source cyberinfrastructure product complements the existing NSF portfolio of tools and resources. The broader impact of these activities was the creation of the next generation of sensor-based applications through technology development and dissemination, collaboration, and training. The cyberinfrastructure products, the training courses, and the close collaborations with domain scientists have resulted in a broad and positive impact on the scientific community, and through their science, a broad and positive impact on students, citizens, and policy makers.