Atmospheric science uses data assimilation, a process by which information from disparate sources is combined to produce a representation of the state of the atmosphere at a particular time. The sources include traditional meteorological instruments, which supply observational data collected near the specified time, and computer simulation models. The observational data are synthesized in complex ways with one or more simulations of conditions near that time. The resulting representation is known as an analysis. Analyses are produced in this way on a daily basis, and they serve as initial conditions for weather forecasting models. Data assimilation is also performed retrospectively, using archives of observational data, to produce reanalysis datasets that span several decades. These reanalysis datasets have become a staple of weather and climate research; they are used to investigate atmospheric dynamics, to evaluate climate models, and to make quantitative estimates of recent climate change.

Because the products of data assimilation are often used in the roles of traditional observational data, some concerns seem pertinent. Are some of these uses problematic? Or are the differences between traditional observational data and the products of data assimilation not very deep? What place should data assimilation have within contemporary philosophical research on scientific method? These are the overarching questions that motivate this project.

The first part of the project will explore the extent to which data assimilation and its products can be reconciled with core concepts in the philosophy of science. Specific questions to be addressed include the following. Can data assimilation systems (or perhaps the simulation models they incorporate) be considered observing instruments in their own right, as some scientists have suggested? Are analyses and reanalyses really so different from other data models, which are theory-laden in various ways? If highly accurate and reliable data assimilation systems could be produced, would they deliver measurements of atmospheric conditions in regions where no traditional observational data were collected?

The second part of the project will evaluate current uses of data assimilation products (in the roles of traditional observational data) from a methodological point of view. Key questions to be addressed are the following. Does the use of reanalyses to evaluate climate models involve an unacceptable circularity, given that climate models share many assumptions with the simulation models used in reanalysis? More generally, on what grounds can inferences from data assimilation products to conclusions about the real atmosphere (such as those concerning the causes of atmospheric phenomena or the rates of atmospheric warming) be justified, given that data assimilation products are produced in response to a lack of traditional observational data?

Intellectual merit

The project aims to contribute to both philosophy of science and atmospheric science. In philosophy of science, it will advance understanding of the ways in which computer simulation is changing scientific practice and complicating the theory of knowledge associated with it, currently a major area of interest. In doing so, it will also re-examine some core concepts of the discipline, such as measurement. For atmospheric science, the project will address fundamental methodological questions about data assimilation that have not received the attention they deserve, despite their direct relevance for practice.

Broader impacts

Results of the project will be disseminated widely in journal articles and in talks, targeting both philosophers and atmospheric scientists. More accessible presentations will also be developed for undergraduate students in meteorology and philosophy. In addition, some of the research has the potential to benefit society by improving evaluation of the climate models whose climate change projections are an important input to current environmental policy discussions.

Special Note: This project is jointly funded by STS and CLD (the Climate and Large-Scale Dynamics Program).

Project Report

Overview

Computer simulation is an important part of contemporary scientific practice. In meteorology and climate science, computer simulation is used not only to make predictions, but also to develop datasets that are used as observations of the atmosphere and climate system. These datasets, which are produced from a combination of conventional observations and simulation-based forecasts, are known as reanalysis datasets. This project explored (i) how reanalysis datasets relate to more familiar observations and measurements and (ii) whether there are pitfalls to look out for when using reanalysis datasets as observations.

Research findings

With regard to (i), it was found that reanalysis datasets are not as different from familiar observations and measurements as one might think; like reanalysis datasets, many familiar observations and measurements are produced with the help of calculations informed by theories and models. At present, however, reanalysis datasets have not been subjected to rigorous calibration procedures in the way that many familiar observations and measurements have been. It is this, primarily, which distinguishes them. More generally, research undertaken in this part of the project led to the conclusion that, while computer simulations on their own are not measurement practices, they can be embedded in measurement practices in such a way that simulation results constitute measurement outcomes.

Research in part (ii) of the project focused primarily on the use of reanalysis datasets to evaluate climate models. Since the simulation models used in producing reanalysis datasets share many assumptions with climate models, it might seem problematic to use reanalysis datasets to evaluate climate models. Investigation here led to a distinction between theory-ladenness and model-ladenness of observation, as well as to the identification of different types of model-ladenness. It was found that the use of reanalysis datasets to evaluate climate models does not involve a "vicious" form of model-ladenness, but there is the possibility of artificial inflation of model-data fit. Determining whether such artificial inflation of fit is occurring, however, is not always easy, for reasons explained.

Intellectual contributions

The project contributes to the field of philosophy of science in a number of ways: by helping to clarify the relationship between computer simulation and measurement; by showing how computer simulation can be fruitfully embedded in measurement practices; by distinguishing theory-ladenness and model-ladenness of observation; and by highlighting methodological issues that arise in climate science that merit attention from philosophers. Findings of the project have been presented in numerous talks to audiences in philosophy of science and science studies, both in the U.S. and abroad, and to students in an undergraduate philosophy club. These findings are to be published in two articles in leading philosophy of science journals.

The project contributes to the fields of meteorology and climate science by clarifying how reanalysis datasets relate to more familiar observations and measurements and by showing that the dependence of reanalysis datasets on computer simulation is not necessarily problematic; caution is needed when using reanalysis datasets, but for reasons related to calibration. These findings were communicated to scientists via a poster presented at a major meeting of geophysical scientists and are to be published in an article in an atmospheric science journal. They were also presented to students in an undergraduate meteorology club.

Broader impacts

Reanalysis datasets are employed in a host of ways in climate change research, including research that is relevant to climate policy. It is hoped that this project’s findings can contribute to a deeper understanding of the nature of reanalysis datasets and of their strengths and limitations. Such understanding in turn can facilitate better and more circumspect use of reanalysis datasets in the course of climate change research, strengthening the scientific base for climate policy.

National Science Foundation (NSF)
Division of Social and Economic Sciences (SES)
Program Officer: Frederick M Kronz
Ohio University
United States