The ways in which scientific knowledge is discovered and communicated openly online have changed dramatically over the last three decades, and especially in the last ten years. Multiple factors have contributed to massive growth in this area. The production of vast amounts of data and literature, coupled with the rapid development and advancement of processing, storage, and communication technologies, has made it both possible and necessary to use more computing technologies and methods in the research process. Computer-mediated discovery processes are being increasingly adopted and used by different scientific communities to explore and study a wide range of domains of knowledge, including all scientific disciplines. Computational scientific knowledge discovery is thus rapidly emerging as a new form of scientific inquiry in the virtual environment, building upon and supplementing the research based on theoretical, experimental, and observational methods that preceded it. At the same time, many new models of open science have been developed that take much greater advantage of the capabilities of digital networks. When integrated online, various types of open knowledge resources are forming incipient information "commons" and knowledge environments, which can derive more value from the public investments in research. Of particular interest to this proposed project, such mechanisms can enable more efficient and effective applications of digital scientific knowledge discovery tools and techniques. A deeper understanding of the opportunities and barriers to such processes has the potential to accelerate the progress of scientific research and to support U.S. national competitiveness and increased productivity in information-intensive areas of research and its applications.
An improved understanding of these issues can also enable research managers and policymakers to make better-informed decisions about the research enterprise, and to explain more clearly to other policymakers and to the general public how the public investment in research and digital technologies advances broader socioeconomic interests.
The Future of Scientific Knowledge Discovery in Open Networked Environments
Report from a National Workshop
by Paul F. Uhlir
Director, Board on Research Data and Information
National Research Council

On March 10-11, 2011, the Board on Research Data and Information (BRDI) of the National Research Council in Washington, DC held a national workshop on The Future of Scientific Knowledge Discovery in Open Networked Environments. Digital technologies and networks are now part of everyday work in the sciences, and have enhanced access to and use of scientific data, information, and literature significantly. They offer the promise of accelerating the discovery and communication of knowledge, both within the scientific community and in the broader society, as scientific data and information are made openly available online. The phrase "scientific knowledge discovery in open networked environments" is subject to many definitions. For purposes of this project, the focus was on computer-mediated or computational scientific knowledge discovery, taken broadly as any research processes enabled by digital computing technologies. Such technologies may include data mining, information retrieval and extraction, artificial intelligence, distributed grid computing, and others. These technological capabilities support computer-mediated knowledge discovery, which some believe is a new paradigm in the conduct of research. The emphasis was primarily on digitally networked data, rather than on the scientific, technical, and medical literature. The meeting also focused mostly on the advantages of knowledge discovery in open networked environments, although some of the disadvantages were raised as well. The workshop brought together a set of stakeholders in this area for intensive and structured discussions.
The purpose was not to make a final declaration about the directions that should be taken, but to further the examination of trends in computational knowledge discovery in the open networked environments, based on the following questions and tasks:
1. Opportunities and Benefits: What are the opportunities over the next 5 to 10 years associated with the use of computer-mediated scientific knowledge discovery across disciplines in the open online environment? What are the potential benefits to science and society of such techniques?
2. Techniques and Methods for Development and Study of Computer-mediated Scientific Knowledge Discovery: What are the techniques and methods used in government, academia, and industry to study and understand these processes, the validity and reliability of their results, and their impact inside and outside science?
3. Barriers: What are the major scientific, technological, institutional, sociological, and policy barriers to computer-mediated scientific knowledge discovery in the open online environment within the scientific community? What needs to be known and studied about each of these barriers to help achieve the opportunities for interdisciplinary science and complex problem solving?
4. Range of Options: Based on the results obtained in response to items 1–3, define a range of options that can be used by the sponsors of the project, as well as other similar organizations, to obtain and promote a better understanding of the computer-mediated scientific knowledge discovery processes and mechanisms for openly available data and information online across the scientific domains. The objective of defining these options is to improve the activities of the sponsors (and other similar organizations) and the activities of researchers that they fund externally in this emerging research area.
The first day of the 2-day meeting consisted primarily of invited expert speakers, who addressed tasks 1–3.
This was followed immediately by a discussion on the second day to leverage the expertise of the invitees to revisit the first three tasks and to address task 4, based on the discussions on the first day. The slides presented by the speakers at the meeting are posted on the National Academy of Sciences’ Board on Research Data and Information Web site at http://sites.nationalacademies.org/PGA/brdi/PGA_060424, and the entire meeting was webcast. This report has been prepared by the workshop rapporteur as a factual summary of what occurred at the workshop. The steering committee’s role was limited to planning and convening the workshop. The views contained in the report are those of the individual workshop participants and do not necessarily represent the views of all workshop participants, the steering committee, or the National Academies. It can be argued that too much time has passed since the meeting took place, and that the results of this effort are not timely enough to provide insight. In fact, the elapsed time between the events reported here and this report has provided time to assess the issues with more care. We are grateful to the National Science Foundation (NSF) for support of this project under NSF Grant Number 1042078. The full report is available openly and free of charge online on the Board’s website at www.nas.edu/brdi and on the National Academies Press site at http://www.nap.edu.