The development of ontologies that define entities and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage the burgeoning data that are pervasive in biology and medicine. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies. To date, these groups of ontology developers have been limited by the lack of methods and tools that facilitate distributed, collaborative engineering of large-scale ontologies and vocabularies. In this proposal, we outline three specific aims. First, we will explore basic computational methods that are essential for collaborative ontology engineering. We will investigate methods for representing diverse collaborative workflows, information about changes and concept history, trust, and provenance, and for recording decision making and design rationale. Empirical analysis of existing ontology-development projects will inform our construction of models for collaborative development workflows that will guide the processes of authoring, reviewing, and curating biomedical ontologies. Second, we will use the results from our first specific aim to build cProtégé, a set of robust, customizable, interactive tools to support distributed users in their collaborative work to build and edit terminologies and ontologies. Third, we will evaluate our work in the context of real-world, large-scale ontology-engineering projects, including the autism ontology of the National Database for Autism Research; the 11th revision of the WHO's International Classification of Diseases; the Ontology for Biomedical Investigations, under development by a wide range of NIH-supported researchers; and BiomedGT, under development by NCI.
It is no longer feasible to imagine that investigators can create biomedical ontologies working independently. The collaborative methods that we will study and the tools that we will build will lead to expanded opportunities to support the diverse data- and knowledge-intensive activities that pervade BISTI, the CTSAs, the NCBCs, and myriad biomedical initiatives that require robust, scalable ontologies.

Public Health Relevance

The knowledge-based nature of modern medicine requires the use of ontologies and terminologies to process and integrate data. Ontology development has itself become a collaborative process, with members of the larger research community contributing to and commenting on emerging ontologies. We plan to extend the Protégé ontology editor (the most widely used ontology editor today, with almost 100,000 registered users) to support collaborative development of ontologies, and to evaluate the new tools by deploying them at the World Health Organization for the development of ICD-11 and in other settings.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM086587-04
Application #: 8242742
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Project Start: 2009-03-01
Project End: 2013-02-28
Budget Start: 2012-03-01
Budget End: 2013-02-28
Support Year: 4
Fiscal Year: 2012
Total Cost: $392,767
Indirect Cost: $124,949

Name: Stanford University
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J et al. (2016) Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology. J Biomed Inform 60:199-209
Lamprecht, Daniel; Strohmaier, Markus; Helic, Denis et al. (2015) Using ontologies to model human navigation behavior in information networks: A study based on Wikipedia. Semant Web 6:403-422
Groza, Tudor; Tudorache, Tania; Robinson, Peter N et al. (2015) Capturing domain knowledge from multiple sources: the rare bone disorders use case. J Biomed Semantics 6:21
Wang, Hao; Tudorache, Tania; Dou, Dejing et al. (2015) Analysis and Prediction of User Editing Patterns in Ontology Development Projects. J Data Semant 4:117-132
Musen, Mark A; Protégé Team (2015) The Protégé Project: A Look Back and a Look Forward. AI Matters 1:4-12
Mortensen, Jonathan M; Minty, Evan P; Januszyk, Michael et al. (2015) Using the wisdom of the crowds to find critical errors in biomedical ontologies: a study of SNOMED CT. J Am Med Inform Assoc 22:640-8
Mortensen, Jonathan M; Musen, Mark A; Noy, Natalya F (2014) An empirically derived taxonomy of errors in SNOMED CT. AMIA Annu Symp Proc 2014:899-906
Horridge, Matthew; Tudorache, Tania; Nuylas, Csongor et al. (2014) WebProtégé: a collaborative Web-based platform for editing biomedical ontologies. Bioinformatics 30:2384-5
Walk, Simon; Singer, Philipp; Strohmaier, Markus et al. (2014) Discovering beaten paths in collaborative ontology-engineering projects using Markov chains. J Biomed Inform 51:254-71
Strohmaier, Markus; Walk, Simon; Pöschko, Jan et al. (2013) How Ontologies are Made: Studying the Hidden Social Dynamics Behind Collaborative Ontology Engineering Projects. Web Semant 20:

Showing the most recent 10 out of 14 publications