This proposal will examine the biological basis of mathematical modeling in computational biology. Systematic biologists construct classifications and phylogenies of organisms, and this work depends on the assumption that adding more data will cause their results to converge statistically. However, the biological basis for this convergence varies considerably across inference methods and organisms, and it is heavily contested. This project will address the problem of justifying statistical convergence in systematic biology through a hybrid approach that integrates philosophical analysis with a historical study of the reception of statistical methods in systematics over the past sixty years, including numerical taxonomy, cladistics, and maximum likelihood methods. The study will emphasize how the use and value of statistical methods varied across areas of systematics, such as microbiology, zoology, and paleontology. To achieve a comprehensive, comparative analysis, the researchers will develop computational methods for tracing reception in published journal articles. The project will use the path-breaking OCHRE database system, which provides an innovative data structure for the digital humanities: a semi-structured rather than a relational database, which can support analytic categories that change over time.
Intellectual Merit: This project is an interdisciplinary collaboration integrating biology, philosophy of science, and history of science to study the strengths and limitations of statistical methods in systematics. The Field Museum is an ideal location for this collaboration because of the excellence and diversity of its research in systematics, including cutting-edge projects on integrating data for biodiversity studies, on molecular techniques for the Tree of Life project, and on the history and philosophy of systematics. The study, with its focus on statistics as a constraint on research in systematics, will open a new epistemological perspective on individuality as a complementary problem, especially with regard to individuating structural elements in classifications or phylogenies and individuating characters in organisms. Examining how different areas of systematics received and appropriated statistical methods will greatly expand current historical knowledge of theoretical advances in microbiology and botany over the past sixty years. In addition, the project will develop research tools for investigating the mathematization of biology and other sciences, an increasingly important problem in the history and philosophy of science. For example, one result of the proposed research will be computational methods for analyzing the transformative effect that mathematization had on the structural relations among research problems in systematics.
Potential Broader Impact: Because the project uses an integrated interdisciplinary approach to address a key practical problem in biology today, its results will cross disciplinary boundaries to inform science policy in biology and research in the digital humanities. The project will produce a targeted evaluation of the basis of statistical convergence that is relevant to current debates among policy makers and biologists about the value of mathematics and high-throughput data gathering techniques. Results will be disseminated in interdisciplinary journals and at conferences that regularly include biologists in their audience. New computational methods developed in the project will be incorporated as Java modules in the OCHRE system, which is available for use in academic research and has a web-based graphical user interface. The new computational methods and associated results will also be presented at workshops on digital humanities and on digital history and philosophy of science.