M.K. Gardner, S.W. Ellingson, K.R. Bisset, R.B. Knapp, and E.J. Brown

Collaboration in scientific discourse is as much about building community as it is about data sharing. It is about discipline-specific web portals to ease data discovery and to facilitate access. It is about validated tools for working with the data and about the means for replicating results. It is about data curation and management. It is about controlling access to the data in order to obey regulatory requirements. And it is about the social interactions between scientists. Successful collaborations already address these issues, but in ad hoc ways. New collaborations must tackle the same challenges over and over again. It would be much more productive if a standard infrastructure existed for building and hosting collaboration environments and communities.

The Advanced Scientific Collaboration Environment and DMZ (ASCED) is a standardized infrastructure, based upon open source components, for building and hosting scientific communities. It is built upon the Science DMZ, an infrastructure for efficiently sharing data, by including a private cloud for hosting collaboration environments. It supports role-based access control, facilitates data management, and provides PIs with tools for abiding by regulatory requirements. It is a standard base upon which scientific communities can be built, allowing future proposals to focus on the specifics of their research. It is also envisioned that ASCED will be used as a platform for educators to engage students at all levels. And it can be a platform for engaging the public, both in the dissemination of results and in participating in research.

Project Report

Collaboration in scientific discourse is as much about building communities as it is about data sharing. Building a community around data implies the existence of an environment for working with the data and for sharing discovery. Most of all, building a scientific community requires supporting interactions between scientists. Successful collaborations already address these issues, but in ad hoc ways. New collaborations must tackle the same challenges over and over again. It would be much more productive if a standard infrastructure existed for building and hosting collaboration environments and communities.
The Advanced Scientific Collaboration Environment and DMZ (ASCED) is a standardized infrastructure, based upon open source components, for building and hosting scientific communities. It builds upon the Science DMZ, an infrastructure for efficiently sharing data, by adding a private cloud for hosting collaboration environments. It supports role-based access control, facilitates data management, and provides investigators with tools for abiding by regulatory requirements. It is a base upon which scientific communities can be built, allowing future investigators to focus on the specifics of their research rather than the minutiae of building communities. It is also envisioned that ASCED will be used as a platform for educators to engage students at all levels. And it can be a platform for engaging the public, both in the dissemination of results and in participating in research.
The goals of the ASCED project are, first, to greatly improve the ability of scientists to share large amounts of data with colleagues throughout the world and, second, to provide a standardized infrastructure upon which to build communities surrounding the data. The sharing of large amounts of data is facilitated by implementing a Science DMZ.
A Science DMZ is a special network, separate from the institution's network, that is specifically designed to transfer scientific data safely, securely, and at high speed. The Science DMZ at Virginia Tech, schematically shown in the figure, provides a direct connection from the high-performance computing resources on campus to high-speed national and international research networks such as Internet2. As an example, before the ASCED Science DMZ the fastest way to transfer large data sets from a radio telescope in the Southwest was to mail a set of disks to Virginia Tech. The turnaround time for the disks was nearly two weeks. With ASCED, the average transfer time is now less than a day, a roughly 15-fold speedup. The faster transfers, coupled with the supercomputing resources at Virginia Tech, have the potential to dramatically accelerate discovery in this field. Other fields will benefit similarly.
As discussed above, scientific discourse is as much about building communities as it is about sharing data. The ASCED project facilitates the building of scientific communities by providing standardized infrastructure, in the form of a private cloud, that scientists can use to build virtual collaboration environments for their communities. The private cloud is based on open source software components that can be freely used by other institutions. Support for virtual collaboration environments is designed to minimize friction for scientists. The principal investigator (PI), who is in the best position to know what the community needs, is empowered to configure the virtual environment to meet those needs, including who has access and to what information. The PIs, or their delegates, are also the ones who configure their environment with tools for working with the data. Collaborators access virtual environments through the ASCED web portal. Interest in the ASCED virtual collaboration infrastructure is growing at Virginia Tech.
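The access model described above, in which a PI or delegate decides who may enter a collaboration environment and what each person may do there, can be illustrated with a minimal sketch. This is not ASCED's implementation: the role names, the permission sets, and the `CollaborationEnvironment` class are all hypothetical, chosen only to show the general shape of role-based access control.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical roles and permissions (assumptions for illustration only;
# the report does not list the roles ASCED actually defines).
ROLE_PERMISSIONS = {
    "pi": {"read", "write", "manage_members"},
    "delegate": {"read", "write", "manage_members"},
    "collaborator": {"read", "write"},
    "public": {"read"},
}


@dataclass
class CollaborationEnvironment:
    """A toy stand-in for one virtual collaboration environment."""
    name: str
    members: Dict[str, str] = field(default_factory=dict)  # user -> role

    def is_allowed(self, user: str, action: str) -> bool:
        # Users without a role in this environment are denied everything.
        role = self.members.get(user)
        return role is not None and action in ROLE_PERMISSIONS[role]

    def grant(self, actor: str, user: str, role: str) -> None:
        # Only members whose role includes "manage_members" may assign roles.
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        if not self.is_allowed(actor, "manage_members"):
            raise PermissionError(f"{actor} may not manage members")
        self.members[user] = role


def create_environment(name: str, pi: str) -> CollaborationEnvironment:
    # The creating PI is the first member and holds the "pi" role.
    env = CollaborationEnvironment(name)
    env.members[pi] = "pi"
    return env
```

In this sketch, `create_environment("telescope-data", "alice")` followed by `env.grant("alice", "bob", "collaborator")` lets `bob` read and write but not add members. Keeping permissions attached to roles rather than to individuals is the design choice that lets a PI hand administration to a delegate without changing any per-user rules.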
The first collaboration environment primarily contains tools for transferring data. However, other tools are being considered for inclusion, including wikis, blogs, and social media software. As scientists gain more experience with the standardized infrastructure, it is expected that the number and types of tools made available for collaboration will increase. Discussions are beginning on how to use virtual collaboration environments to enable citizen scientists to assist with the discovery process and thereby increase public understanding of, and engagement in, science.
In summary, much of science depends upon sharing data and upon building communities around the data. The ASCED project is designed to facilitate those goals by implementing a Science DMZ to transfer data rapidly and by providing standardized infrastructure for building scientific communities. Improvements in data transfer rates are already transforming research in data-intensive disciplines at Virginia Tech. Further, explorations are ongoing as to the best ways to leverage virtual collaboration environments to build communities and to involve the public in scientific discourse.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1245827
Program Officer: Kevin Thompson
Project Start:
Project End:
Budget Start: 2013-01-01
Budget End: 2014-12-31
Support Year:
Fiscal Year: 2012
Total Cost: $291,260
Indirect Cost:
City: Blacksburg
State: VA
Country: United States
Zip Code: 24061