The development of ontologies that define entities and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage the burgeoning data that are pervasive in biology and medicine. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies. To date, these groups of ontology developers have been limited by the lack of methods and tools that facilitate distributed, collaborative engineering of large-scale ontologies and vocabularies. In this proposal, we outline three specific aims. First, we will explore basic computational methods that are essential for collaborative ontology engineering. We will investigate methods for representing diverse collaborative workflows, information about changes and concept history, trust, and provenance, and for recording decision making and design rationale. Empirical analysis of existing ontology-development projects will inform our construction of models for collaborative development workflows that will guide the processes of authoring, reviewing, and curating biomedical ontologies. Second, we will use the results from our first specific aim to build cProtégé, a set of robust, customizable, interactive tools to support distributed users in their collaborative work to build and edit terminologies and ontologies. Third, we will evaluate our work in the context of real-world, large-scale ontology-engineering projects, including the autism ontology of the National Database for Autism Research; the 11th revision of the WHO's International Classification of Diseases; the Ontology for Biomedical Investigations, under development by a wide range of NIH-supported researchers; and BiomedGT, under development by NCI.
It is no longer feasible to imagine that investigators can create biomedical ontologies working independently. The collaborative methods that we will study and the tools that we will build will lead to expanded opportunities to support the diverse data- and knowledge-intensive activities that pervade BISTI, the CTSAs, the NCBCs, and myriad biomedical initiatives that require robust, scalable ontologies.

Public Health Relevance

The knowledge-based nature of modern medicine requires the use of ontologies and terminologies to process and integrate data. Ontology development has itself become a collaborative process, with members of the larger research community contributing to and commenting on emerging ontologies. We plan to extend the Protégé ontology editor (the most widely used ontology editor today, with almost 100,000 registered users) to support collaborative development of ontologies, and to evaluate the new tools by deploying them at the World Health Organization for the development of ICD-11 and in other settings.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Stanford University
Internal Medicine/Medicine
Schools of Medicine
United States
Mortensen, Jonathan M; Minty, Evan P; Januszyk, Michael et al. (2015) Using the wisdom of the crowds to find critical errors in biomedical ontologies: a study of SNOMED CT. J Am Med Inform Assoc 22:640-8
Walk, Simon; Singer, Philipp; Strohmaier, Markus et al. (2014) Discovering beaten paths in collaborative ontology-engineering projects using Markov chains. J Biomed Inform 51:254-71
Strohmaier, Markus; Walk, Simon; Poschko, Jan et al. (2013) How Ontologies are Made: Studying the Hidden Social Dynamics Behind Collaborative Ontology Engineering Projects. Web Semant 20:
Tudorache, Tania; Nyulas, Csongor; Noy, Natalya F et al. (2013) WebProtégé: A Collaborative Ontology Editor and Knowledge Acquisition Tool for the Web. Semant Web 4:89-99