This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). This research examines practices of software development, co-creation and sharing in collaborative scientific research based on a sample of multiple virtual organizations using the Open Science Grid. Almost every workflow that generates scientific results involves software, from configuration and control of instruments, to statistical analysis, simulation and visualization. Creating and maintaining software is a significant activity in scientific laboratories, including science and engineering virtual organizations. Success depends upon overcoming both the challenges of collaboration and the complex coordination problems of distributed software development.
A key challenge is that while science is collaborative, it is not selfless; scientists face competitive pressures throughout their careers, and maintaining ongoing control over the software they write may be related to their career success. Consequently, common software development models like traditional open source projects may be less appropriate for scientific software than alternative models drawn from open software ecosystems, such as Eclipse, that support collaboration between groups that are also competitors. Without an adequate understanding of both the coordination and the incentive challenges, scientific funding agencies cannot provide clear guidance on appropriate policy to promote and sustain effective software development in collaborative science.
This research extends theories of architectural alignment in distributed software development and hypothesizes a central role for technical architectures in establishing the framework within which coordination, cooperation, and competition take place. In particular, we hypothesize an enabling role for architectures that are appropriately aligned to support collaboration, particularly as incentives change over time. To examine existing practices and test these hypotheses, the research team will collect data over three years from interviews, the Open Science Grid's software usage data, software source code, and development repositories.
The results will build understanding of how scientists create and share their software. Such understanding will help to improve collaboration practices in scientific research and help science-funding bodies support scientists through policy, guidelines and education on software development in funded scientific research.