This collaborative proposal describes five years of development work on the Globus Toolkit (GT) software system. The work, titled "Community Driven Improvement of Globus Software" (CDIGS), promises to respond to the needs of the scientific user community by adding functionality and improving the performance of an existing, well-used software base exceeding 2.5 million lines of code. The proposal first guides the reader through a primer on the Globus user community and on Globus technology and components. The detailed project plan focuses on the first 24 months and cites specific work in the areas of security, data management, execution management, information services, and common runtime. The section on software engineering practices describes how the team plans to develop, integrate, and make available new capabilities across major GT code releases. Several sections describe the approach to user engagement and community support, which relies on multiple channels ranging from meetings and community events to IRC and email list forums. One section covers the project team and management plan and defines the roles of well-known, veteran GT developers and designers. The project team consists of about 14 FTEs covering full-time software engineering, coordination, management, and documentation. The management plan includes the creation of a Technology board that oversees software engineering activities, an External Advisory board composed of industry, academic, and federal experts and stakeholders, and a Globus Alliance board charged with defining technical direction and providing a direct tie to the Globus Alliance organization. The proposal includes 102 letters of support, including letters from most of the significant grid-related projects worldwide.
Intellectual Merit: Despite the term "research" appearing in the title, the proposal is clear about being a development project. The proposal claims that new knowledge will be gained, spanning areas of software engineering, in the pursuit of delivering Globus technology to scientific users in grid environments.
Broader Impact: Throughout, the proposal identifies the dependence of many major cyberinfrastructure (CI) initiatives on GT. Continued support and new development of Globus will thus sustain and further enable these existing distributed environments, and will smooth the way for new grid environments to form in support of distributed science, engineering, and education. The proposal makes a direct case for CDIGS supporting production science.
Research no longer fits old stereotypes of white coats and long solitary hours amid bubbling beakers. A typical research project today is more likely to involve multidisciplinary teams and to be concerned with data analysis and computer simulation than with laboratory experiments. New collaborative and computational approaches are enabling rapid advances in the physical, biological, and social sciences, and in engineering. For such approaches to be effective, researchers require software that enables distributed teams to share computing, storage, data, software, instrumentation, and other resources. The Community Driven Improvement of Globus Software (CDIGS) project, funded by the National Science Foundation from 2006 to 2011, has enabled a team at the University of Chicago and the University of Southern California to support and enhance the Globus Toolkit, a widely used software system of this kind. Globus software addresses three vital problems in 21st-century distributed science: controlling access to resources, managing data movement, and enabling the use of remote computation. CDIGS-developed software is used across the US and around the world, in classrooms and research laboratories, by individual laboratories and global collaborations, and in essentially every research domain supported by the National Science Foundation. For example: The Laser Interferometer Gravitational Wave Observatory uses Globus software to distribute data collected at its observatory sites in the states of Washington and Louisiana to researchers across the US and Europe. The TeraGrid, NSF’s flagship "cyberinfrastructure," uses Globus software to support "science gateways" that enable thousands of researchers to use TeraGrid supercomputers in their research. Campus grids from California to Texas use Globus software to provide on-demand computing services to faculty and students.
The Earth System Grid uses Globus software to enable more than 20,000 registered users to access large quantities of climate simulation data. Neuroscientists use Globus software to enable sharing and collaborative analysis of brain imaging data. To support these and many other uses and users of Globus software, the CDIGS team of software architects, software engineers, and support engineers has undertaken four primary tasks over the past five years: (1) evolve and enhance Globus functionality, performance, scalability, and robustness; (2) improve usability and manageability so as to decrease the cost and complexity of deploying, operating, and using Globus infrastructure; (3) support major NSF users and communities; and (4) expand the Globus community. One quantitative measure of the project’s success is the continued growth in usage of Globus software. For example, 2011 metrics show more than 10 million files being transferred per day using Globus GridFTP, and more than 15 million jobs executed per month using Globus GRAM on the Open Science Grid, another national cyberinfrastructure. CDIGS also took first steps towards further broadening the Globus user community through the development of Globus Online, a new system that provides sophisticated cyberinfrastructure capabilities via software-as-a-service ("cloud") methods, accessible to anyone with a network connection and a Web browser. Recognizing the vital enabling role that computing plays in 21st-century research and education, the National Science Foundation has proposed an ambitious Cyberinfrastructure Framework for 21st Century Science and Engineering (CF21). Globus software, and the lessons learned from developing and using that software, will serve NSF well as it embarks on the CF21 mission.