The majority of NSF-funded research occurs in small and medium-sized laboratories (SMLs) that often comprise a single PI and a few students and postdocs. For these small teams, the growing importance of cyberinfrastructure and its applications in discovery and innovation is as much a problem as an opportunity. With limited resources and expertise, even simple data discovery, collection, analysis, management, and sharing tasks are difficult. An unfortunate consequence is that in this "long tail" of science, modern computational methods often are not exploited, much valuable data goes unshared, and too much time is consumed by routine tasks. To date, research investments in science cyberinfrastructure have disproportionately emphasized big science projects, providing tools for use by IT staff and technology-savvy researchers rather than complete applications consumable by end users. This project's goal is to lay a foundation for a more balanced research agenda by focusing exclusively on the needs of SMLs.
This Software Institute Conceptualization project aims to determine whether these obstacles to discovery and innovation can be overcome via the use of software as a service (SaaS) methods. Such methods have proven immensely effective for small and medium businesses due to their ability to deliver advanced capabilities while streamlining the user experience and achieving economies of scale. To determine whether similar benefits can apply for SMLs, the project team will engage with multiple science communities to identify science practices, match science practices against candidate SaaS offerings, and evaluate business models that could permit sustainable development of those offerings. The outcome of this process is intended to be a compelling and competitive strategic plan for an NSF Software Innovation and Sustainability Institute that both meets immediate needs of the initial science communities and provides a basis for a new, more cost-effective method of addressing cyberinfrastructure needs across all NSF directorates.
Major Activities
Our activities at UCLA focused on the study of two major earth science projects: the Center for Embedded Networked Sensing (CENS) and the Center for Dark Energy Biosphere Investigations (C-DEBI). We concentrated on assembling data and real-world examples that illustrate the issues and opportunities for software as a service (SaaS) in small and medium-sized laboratories (SMLs). Our team has taken a two-pronged approach to meet our goals. The first approach consists of establishing connections to SML communities by participating in workshops and other events, where we identify the data management challenges they face that might be addressed by better tools. The second approach consists of building longer-term relationships with specific SMLs to study them in depth.
Workshops
Center for Dark Energy Biosphere Investigations (C-DEBI) All-Hands meeting, Monterey Bay, CA (Oct 22-23, 2012). C-DEBI is an NSF Science and Technology Center (STC) studying subseafloor microbial life. In October, IELTR (Institute for Empowering Long Tail Research) team members from UCLA (Peter Darch, who is supervised by Christine Borgman) and USC (Carl Kesselman) joined the Dark Energy Biosphere community at the C-DEBI All-Hands meeting. The C-DEBI community consists of small- and medium-scale, highly interdisciplinary research teams. The workshop provided opportunities to learn about the unmet cyberinfrastructure needs of this scientific community. Our team continues to observe C-DEBI’s data practices as a means to understand their technical and social requirements.
NSF EarthCube Ocean ‘Omics Workshop, Catalina Island, CA (Aug 21-23, 2013). Peter Darch attended a workshop arranged by the NSF as part of its EarthCube project. This was one of a series of workshops aimed at targeted end users, in this case scientists studying microbes in the ocean and beneath the seafloor. The workshop presented EarthCube to the target community and included sessions aimed at gathering requirements from these users.
The American Geophysical Union (AGU) Annual Meeting, San Francisco, CA (Dec 9-13, 2013). The AGU meeting is the primary conference attended by scientists studying subseafloor microbial life. At this meeting, Peter Darch and Rebekah Cummings, both under the supervision of Christine Borgman, presented the results from the first year of the C-DEBI case study with a paper presentation, "Buried deep: How data about subseafloor life becomes dark and why," and a poster, "Between land and sea: divergent data stewardship practices in deep-sea biosphere research."
Datasphere at the Biosphere II workshop, Oracle, AZ (May 5-6, 2014). This workshop was organized as part of our project for conceptualizing an Institute for Empowering Long Tail Research. It brought together 19 researchers in the domain of biodiversity. The purpose of this workshop was to seek feedback from the earth sciences community on whether software as a service could offer solutions to the data challenges they face in their research. Through our interactions at the workshop, we were able to draw generalized conclusions by highlighting similarities between the data challenges discussed at the workshop and those faced in the recent past by other scientists working at SMLs affiliated with CENS and C-DEBI.
Long-term studies and collaborations
Center for Embedded Networked Sensing (CENS). Rebekah Cummings, supervised by Christine Borgman, analyzed interviews conducted during the period 2002-2012 with members of the CENS collaboration, an NSF Science and Technology Center that ran from 2002 to 2012. These interviews had previously been conducted by members of Borgman's Knowledge Infrastructures research team in the Department of Information Studies at UCLA. We used our knowledge of CENS to compare their data challenges with the data needs of more recent collaborations such as C-DEBI.
We further analyzed a number of requirements to inform the development of an Institute for Empowering Long Tail Research.
Center for Dark Energy Biosphere Investigations (C-DEBI). Our team (Peter Darch and Rebekah Cummings, both supervised by Christine Borgman) interviewed nearly 50 C-DEBI-affiliated personnel and analyzed relevant documents. We also conducted extensive observational work, including eight months (10-20 hours per week) in a laboratory at the University of Southern California, accompanying scientists on a three-day research trip to Catalina Island, and attending the C-DEBI All-Hands meeting and other conferences and seminars. This observational work enhanced our understanding of the scientists' work beyond the knowledge obtained via interviews and document analysis. As a result, our findings concern technical, cultural, and social dimensions of data practices that are essential to consider in the design of effective solutions. The researchers studied have highly divergent data handling practices that limit their options for generic software tools.
Our team has established a long-term engagement with the scientific communities under study, and we plan to continue our observational work on data practices and needs in long-tail science. We have built long-term collaborative relationships with C-DEBI and our partners on this grant, which we expect to lead to future partnerships and joint projects.