Taxonomic monographs combine information about the appearance, occurrence, genealogical relationships, and classification of organisms and provide methods for their identification. This baseline information is critical for understanding the diversity of life and helps inform other biological research questions. Current taxonomic monographs are static documents that do not provide an efficient means of accessing or analyzing the underlying data. This project will develop a linked set of computational tools (DynaMo) to produce a dynamic monograph that streamlines the workflow, integrates specimen-based data, and facilitates analysis of these data. Although it is in the early stages of development, DynaMo shows immense potential to transform how taxonomic monographs are generated, analyzed, and presented. The project will train a postdoctoral researcher and undergraduates, including members of underrepresented groups, in systematic biology, bioinformatics, and software development. Two workshops on dynamic monographs and quantitative taxonomy will help train the broader systematics community. The resulting DynaMo software, and all data generated by the project, will be made freely available through public databases and repositories.

While other areas of systematics have adopted modern approaches, the production of taxonomic monographs has been hindered by the lack of computational tools to efficiently integrate and update the large data sets and complex analyses that underlie monographic research. This project will develop the digital infrastructure (DynaMo) to enable systematists to document and interact with all the data management and analysis tools needed to produce a monograph, and to help make monographic methods and analyses more transparent, reproducible, and extendable. The digital infrastructure underpinning this workflow will be a generic relational database that stores all primary biodiversity data as well as intermediate resources for faster retrieval. The data stored in the database will be integrated and analyzed via modular pipelines. The database will be continuously updated as new data and results become available, and the analysis pipelines will automatically generate new results whenever the underlying data in the database change. The results of analyses, interactive visualizations, raw data, and text will be machine-readable and accessible via a dynamic monograph, an online document that is perpetually up to date because it provides an interface to the underlying specimen-based database.
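
The idea of analysis pipelines that regenerate results whenever the database changes can be illustrated with a minimal sketch. This is not DynaMo's actual design, which the abstract does not specify. It assumes an in-memory SQLite database, a hypothetical `specimen` table with a hypothetical `leaf_length_mm` measurement, and a hypothetical `revision` counter that triggers bump on every change so that a pipeline step knows when its cached result is stale.

```python
import sqlite3

# Hypothetical schema: one specimen table plus a revision counter that
# triggers increment on every insert, update, or delete.
SCHEMA = """
CREATE TABLE specimen (
    id INTEGER PRIMARY KEY,
    taxon TEXT NOT NULL,
    locality TEXT,
    leaf_length_mm REAL
);
CREATE TABLE revision (n INTEGER NOT NULL);
INSERT INTO revision VALUES (0);
CREATE TRIGGER specimen_ins AFTER INSERT ON specimen
BEGIN UPDATE revision SET n = n + 1; END;
CREATE TRIGGER specimen_upd AFTER UPDATE ON specimen
BEGIN UPDATE revision SET n = n + 1; END;
CREATE TRIGGER specimen_del AFTER DELETE ON specimen
BEGIN UPDATE revision SET n = n + 1; END;
"""


def open_database():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def mean_leaf_length(conn):
    """Example analysis step: mean leaf length per taxon."""
    rows = conn.execute(
        "SELECT taxon, AVG(leaf_length_mm) FROM specimen "
        "GROUP BY taxon ORDER BY taxon"
    ).fetchall()
    return dict(rows)


class Pipeline:
    """Caches a step's result and recomputes it only when the data changed."""

    def __init__(self, conn, step):
        self.conn = conn
        self.step = step
        self._seen_revision = None  # revision at which the cache was built
        self._result = None
        self.runs = 0  # number of times the step actually executed

    def result(self):
        (rev,) = self.conn.execute("SELECT n FROM revision").fetchone()
        if rev != self._seen_revision:
            # Underlying data changed since the last run: regenerate.
            self._result = self.step(self.conn)
            self._seen_revision = rev
            self.runs += 1
        return self._result
```

A monograph page built on such a step would call `Pipeline(conn, mean_leaf_length).result()` each time it is rendered; the query reruns only after a specimen record has been added, edited, or removed, and otherwise the cached result is returned.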

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Environmental Biology (DEB)
Type: Standard Grant (Standard)
Application #: 1939128
Program Officer: Katharina Dittmar
Project Start:
Project End:
Budget Start: 2019-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2019
Total Cost: $279,967
Indirect Cost:
Name: University of California Los Angeles
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095