Increasingly sophisticated science depends on ever more complex software. Software is increasingly vital across all scientific fields for functions as diverse as analyzing data, simulating phenomena, and controlling instruments. Software, more than simply a supporting service, can be a source of innovation and can enhance science by increasing its transparency, reproducibility, transferability and scale. Currently, however, there is little scientific understanding of the contribution of software work towards progress in science; it is often not even clear what software was used in advancing science and who ought to receive credit for that software's contribution. Without a reliable mechanism for citing software, young scientists are implicitly discouraged from the work needed to build broadly useful software. Collaboration to improve existing software is also unintentionally discouraged. Releasing a package can even lead to an unfortunate outcome, in which the package's success brings support and maintenance burdens that overwhelm its creators, stalling emerging projects and discouraging future sharing.
Both understanding the role of software work in science and overcoming the obstacles slowing its progress depend critically on knowing what software is actually used, how the use of various packages is orchestrated in scientific projects, and where resources can be most effectively deployed to facilitate software development, evolution, and support. In order to continuously acquire such knowledge, this project is designing and building the Scientific Software Network Map. This map will measure the scientific-software ecosystem by collecting data on which software tools have been used in the science leading to publications, the network of dependencies and complementarities from that software to other software packages, and how software use evolves over time. The project is working with scientists in software-heavy fields to research and design data-collection techniques, gather data, and design appropriate analyses. This project will also provide an initial foundation for a scientific understanding of the role of software in science and scientific innovation.
Broader Impacts: The results of the project will improve the practical conduct of software-supported science and inform science policy decision-making related to software's application in science, research, and innovation projects. The publicly available Map that results will be more broadly useful in three mutually reinforcing ways. First, it will be a resource for software-contributing scientists to demonstrate the usefulness and impact of their work, providing a foundation for their motivated, sustained contribution and improving the sustainability of scientific software projects. Second, it will function as an information source for other scientists to know what software is being used in their field, generating common knowledge that reduces costly reimplementation and helping communities coalesce towards shared platforms. Third, it will help science funders identify evolving issues and opportunities in the software ecosystem -- such as the challenge of rapid adoption driving heavy but unanticipated support burdens, or early alerts about communities coalescing around nascent cores that, with timely, focused funding support, can be turned into the software platforms that will drive future scientific innovation.