Scientists have access to continuously growing data repositories that span economic and geographic boundaries. However, both individual innovation and the formation of rich collaborations still rely on traditional research and social mechanisms. While virtual organizations help groups gain access to data environments, members must still proactively seek to collaborate. The difficulty of sharing analysis tools, and the lack of understanding of how such tools are used, create friction that impedes extracting the greatest benefit from data. If the scientific community can formalize the collection of User-Data Interaction (UDI) data and derive actionable, characteristic behavior patterns from it, this friction can be relieved and scientists can be connected in behaviorally meaningful ways that are not currently imagined. This proposal discusses the opportunity to work on that problem.
Data is the lifeblood of science. Recent funding opportunities have fueled support for uploading, archiving, and managing data in more formal and standard ways. However, the actual use of data through data exploration tools is still a highly variable process. Interactive data exploration tools provide the opportunity to record researcher interactions during the exploration process. The pattern of interactions that users undertake while searching, exploring, and using data is a largely unexploited opportunity for new connections and new learning: it could help researchers identify useful exploration modes or gaps, and even new collaborative partners who could increase interactions and innovation. Such data about how users explore data are here termed "User-Data Interaction (UDI) Data."
Creating cyberinfrastructure building blocks to support a standard for collecting UDI data, community development of data exploration tools, and the exploration of UDI data could fundamentally change the practice of science and engineering. Hosting such data and analysis tools within a shared cyberinfrastructure could also allow for unprecedented study of their use and effectiveness. The goal of this conceptualization research is to define an implementation project for the DIBBs program. To achieve this goal, the approach will be to understand the kinds of data analysis tools that various user communities currently use and those that they would like to create and share, and to explore the UDI data that could be collected and leveraged as a result. A data source is characterized as any service to which a user can submit a query and receive a semi-structured result. For example, this may include an online database with which users interact through forms, a graphical interface to a data cube, or even an online simulation tool. The proposing team has access to three such toolkits in use by thousands of users today (Rappture Toolkit, iKNEER, and DataView) to study as sources of analysis tools and UDI data. Specifically, access to the developers of these systems will provide information about how such systems could generate UDI data and what its important features may be. After building an understanding from active communities and small group discussions, the final step of information gathering will be two larger discussions held in conjunction with two events: HUBbub 2013 and an NSF S2I2 conceptualization project meeting.
Intellectual Merit: This research will identify the social and technological roadblocks to sharing data analysis tools, and the transformational potential of UDI data. Its outcome will be an evidence-based blueprint for a cyberinfrastructure environment that will automatically gather UDI data, develop patterns from those data, and facilitate amplified discovery and collaboration based on those patterns in a way that acceptably balances efficacy and privacy. Collaborations are expected to increase in number and in substance.
Broader Impacts: This work will pave the way for new scientific connections among researchers, educators, and students that will accelerate research and innovation. An implementation of the proposed work will help bridge the difficulties that underrepresented groups face in traditional methods of establishing scientific collaborations, allowing everyone to connect to tools and other researchers based not solely on established reputation but also on their interactions with data. Because the work is not specific to one virtual organization or data tool, it will have a broad reach across diverse scientific communities that use data and data analysis tools.
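To make the abstract's definition of a data source concrete (any service to which a user can submit a query and receive a semi-structured result), the following is a minimal Python sketch of how one interaction with such a source might be recorded as UDI data. It is an illustration only, not the project's design: the names UDIEvent, RecordingDataSource, and toy_materials_db, the record fields, and the toy rows are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class UDIEvent:
    """One hypothetical UDI record; the field names are assumptions."""
    user_id: str
    source_name: str
    query: Dict[str, Any]
    result_size: int
    timestamp: str


@dataclass
class RecordingDataSource:
    """Wraps a query function and logs every interaction as a UDIEvent."""
    name: str
    run_query: Callable[[Dict[str, Any]], List[Dict[str, Any]]]
    log: List[UDIEvent] = field(default_factory=list)

    def query(self, user_id: str, query: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Run the underlying query, then record who asked what and
        # how much came back before returning the result unchanged.
        result = self.run_query(query)
        self.log.append(
            UDIEvent(
                user_id=user_id,
                source_name=self.name,
                query=dict(query),
                result_size=len(result),
                timestamp=datetime.now(timezone.utc).isoformat(),
            )
        )
        return result


def toy_materials_db(query: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Stand-in for a form-driven online database (toy data, illustrative only)."""
    rows = [
        {"material": "A", "score": 1.2},
        {"material": "B", "score": 1.5},
        {"material": "C", "score": 0.6},
    ]
    min_score = query.get("min_score", 0.0)
    return [row for row in rows if row["score"] >= min_score]


source = RecordingDataSource(name="toy_db", run_query=toy_materials_db)
rows = source.query("user-1", {"min_score": 1.0})
print(len(rows), len(source.log))
```

Wrapping the source rather than modifying it reflects the abstract's point that the work is not specific to one data tool: under this assumption, the same recorder could sit in front of a database form, a data cube interface, or a simulation tool.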

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1255781
Program Officer: Robert Chadduck
Project Start:
Project End:
Budget Start: 2013-03-15
Budget End: 2015-05-31
Support Year:
Fiscal Year: 2012
Total Cost: $99,718
Indirect Cost:
Name: Purdue University
Department:
Type:
DUNS #:
City: West Lafayette
State: IN
Country: United States
Zip Code: 47907