The following is a proposal for a project in Semantic Citation Analysis--an approach to science citation research that promises novel insight into the organization and transmission of scientific knowledge by incorporating psycholinguistic and neo-Darwinian theory. The goal of the project is to establish a new method for modeling cultural differentiation and transformation using textual data drawn from the published abstracts of a major science citation index. A computational procedure called Latent Semantic Analysis will be used to represent the meaning of each article's abstract quantitatively, allowing its semantic similarity to other articles in the index to be analyzed and graphically illustrated using multi-dimensional scaling. Doing so will allow scientific fields to be evaluated as discrete cultural groups within science on the basis of their latent semantic similarities. The result will be a schematic illustration of the structure of science based on the scientific work actually produced. Analyses based on this procedure will then be able to delineate the structure and overlap of the sciences.
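As a rough illustration of the procedure described above, the following sketch represents a handful of abstracts in a low-dimensional latent space and compares them by cosine similarity. It is a toy under stated assumptions, not the project's implementation: the four abstracts are invented, raw word counts stand in for whatever term weighting the project adopts, and the singular value decomposition is obtained by a simple power iteration on the document-by-document Gram matrix so that the example needs only the Python standard library. The function names are hypothetical, and the multi-dimensional scaling step is omitted.

```python
import math
from collections import Counter

# Invented toy abstracts: two from biology, two from astronomy.
ABSTRACTS = [
    "gene expression regulates protein synthesis in cells",
    "protein synthesis depends on gene expression",
    "galaxy clusters reveal dark matter distribution",
    "dark matter shapes galaxy formation",
]

def term_document_matrix(abstracts):
    """Rows are vocabulary terms, columns are abstracts, entries are raw counts."""
    tokenized = [text.lower().split() for text in abstracts]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    counts = [Counter(toks) for toks in tokenized]
    return [[c[term] for c in counts] for term in vocabulary]

def top_eigenpairs(sym, k, iterations=500):
    """Top-k eigenpairs of a symmetric positive semi-definite matrix,
    found by power iteration with deflation (adequate for toy sizes only)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _dim in range(k):
        vec = [1.0 / math.sqrt(n)] * n  # deterministic starting vector
        for _step in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient gives the eigenvalue for the converged vector.
        value = sum(vec[i] * work[i][j] * vec[j] for i in range(n) for j in range(n))
        pairs.append((value, vec))
        # Deflate: remove the component just found before the next pass.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return pairs

def lsa_document_vectors(abstracts, k=2):
    """Coordinates of each abstract on the top-k latent dimensions.
    If A = U S V^T, then A^T A = V S^2 V^T, so the eigenpairs of the Gram
    matrix yield the document coordinates V S directly."""
    a = term_document_matrix(abstracts)
    n = len(abstracts)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    return [[math.sqrt(max(value, 0.0)) * vec[j] for value, vec in pairs]
            for j in range(n)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

vectors = lsa_document_vectors(ABSTRACTS, k=2)
similarity = [[cosine(u, v) for v in vectors] for u in vectors]
for row in similarity:
    print(" ".join(f"{s:6.3f}" for s in row))
```

In this toy corpus the two biology abstracts score close to 1 with each other and close to 0 with the astronomy abstracts, which is the kind of grouping the proposal would then visualize with multi-dimensional scaling. A real index would call for a sparse matrix library and a proper truncated SVD rather than power iteration.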
Intellectual Merit. In addition to the valuable insight into the differentiated structure of science that this approach will afford, the project's intellectual merit stems from the answers it promises to longstanding questions in the area of cultural transmission as exemplified by scientific citation. In particular, it will use variables derived from the Latent Semantic Analysis to test the fundamental tenets of Evolutionary Culture Theory, a promising multi-disciplinary movement whose development has been impeded by a lack of concrete research. This paucity has stemmed not from a lack of interest in the theory but from a vexing and persistent problem with operationalizing it--namely, the difficulty of finding suitable units of analysis and analytical tools. The proposal offers one solution to these problems, the culmination of a long line of thinking about scientific progress in terms of evolutionary mechanisms and population biology. The project will test the hypothesis that cultural transmission is an evolutionary process amenable to a population-thinking approach, and it will reveal the degree to which scientific success is a function of population variables describing the density and diversity of the semantic space occupied by articles and their paradigmatic themes. Because the approach can be integrated into current network citation research and is extensible to other domains and fields, it has the potential to evolve into a distributed, progressive research program.
Broader Impacts. The proposed research has two notable practical applications. First, representing science as a differentiated cultural system can help inform university and grant administrators by revealing the latent semantic structure of scientific space. At present, scientific work is categorized explicitly and funded on the basis of assumptions regarding the distribution of scientific research--assumptions that may be erroneous or outdated. Using Latent Semantic Analysis to examine the actual distribution of scientific resources on the basis of semantic content can overcome these limitations and allow administrators to allocate scientific resources and structure academic programs on firmer evidence. Second, it can aid multi-disciplinary research by offering the prototype for a semantic search capability. Current scientific document searches match on lexical terms; Latent Semantic Analysis would enable searching on the basis of semantics, regardless of whether the terms match. This will be especially useful to scientists working in multi-disciplinary areas, where terminological differences across fields may impede access to information, but it can also increase the public's access to information by lowering the terminological specificity needed to find scientific documents.
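The semantic search idea can be sketched in a few lines. The example below is a hypothetical toy, not the proposed prototype: the documents and query are invented, raw word counts are used as term weights, and the latent axes come from a simple power iteration so that only the Python standard library is needed. The query is projected onto the same latent axes as the documents and ranked by cosine similarity, so a document can score highly without sharing a single word with the query.

```python
import math
from collections import Counter

# Invented toy documents; index 1 shares no words with the query below.
DOCS = [
    "heart attack treatment",
    "myocardial infarction treatment",
    "galaxy dark matter",
    "dark matter halo",
]
QUERY = "heart attack"

def top_eigenpairs(sym, k, iterations=500):
    """Top-k eigenpairs of a symmetric positive semi-definite matrix,
    found by power iteration with deflation (adequate for toy sizes only)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _dim in range(k):
        vec = [1.0 / math.sqrt(n)] * n  # deterministic starting vector
        for _step in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * work[i][j] * vec[j] for i in range(n) for j in range(n))
        pairs.append((value, vec))
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return pairs

def build_lsa(docs, k=2):
    """Return the vocabulary, the term-document count matrix, and k latent
    axes in term space (the left singular vectors, u = A v / sigma).
    Assumes k does not exceed the rank of the matrix."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    counts = [Counter(toks) for toks in tokenized]
    a = [[c[t] for c in counts] for t in vocab]
    n = len(docs)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    axes = []
    for value, vec in top_eigenpairs(gram, k):
        sigma = math.sqrt(value)
        axes.append([sum(row[j] * vec[j] for j in range(n)) / sigma for row in a])
    return vocab, a, axes

def project(term_vector, axes):
    """Coordinates of a term-space vector on the latent axes."""
    return [sum(u * x for u, x in zip(axis, term_vector)) for axis in axes]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def semantic_search(query, docs, k=2):
    """Score every document against the query in the latent space."""
    vocab, a, axes = build_lsa(docs, k)
    q_counts = Counter(query.lower().split())
    q_low = project([q_counts[t] for t in vocab], axes)
    return [cosine(q_low, project([row[j] for row in a], axes))
            for j in range(len(docs))]

scores = semantic_search(QUERY, DOCS)
for doc, score in sorted(zip(DOCS, scores), key=lambda p: -p[1]):
    print(f"{score:6.3f}  {doc}")
```

Here "myocardial infarction treatment" ranks alongside "heart attack treatment" for the query "heart attack" because both load on the same latent dimension, while a purely lexical search would miss it. With only two latent dimensions and four documents the effect is exaggerated; on a real index, the number of dimensions and the term weighting would need to be tuned.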