The development of ontologies that define entities and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage the burgeoning data that are pervasive in biology and medicine. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies. To date, these groups of ontology developers have been limited by the lack of methods and tools that facilitate distributed, collaborative engineering of large-scale ontologies and vocabularies. In this proposal, we outline three specific aims. First, we will explore basic computational methods that are essential for collaborative ontology engineering. We will investigate methods for representing diverse collaborative workflows, information about changes and concept history, trust, and provenance, and for recording decision making and design rationale. Empirical analysis of existing ontology-development projects will inform our construction of models for collaborative development workflows that will guide the processes of authoring, reviewing, and curating biomedical ontologies. Second, we will use the results from our first specific aim to build cProtégé, a set of robust, customizable, interactive tools to support distributed users in their collaborative work to build and edit terminologies and ontologies. Third, we will evaluate our work in the context of real-world, large-scale ontology-engineering projects, including the autism ontology of the National Database for Autism Research; the 11th revision of the WHO's International Classification of Diseases; the Ontology for Biomedical Investigations, under development by a wide range of NIH-supported researchers; and BiomedGT, under development by NCI.
It is no longer feasible to imagine that investigators can create biomedical ontologies working independently. The collaborative methods that we will study and the tools that we will build will lead to expanded opportunities to support the diverse data- and knowledge-intensive activities that pervade BISTI, the CTSAs, the NCBCs, and myriad biomedical initiatives that require robust, scalable ontologies.

Public Health Relevance

The knowledge-based nature of modern medicine requires the use of ontologies and terminologies to process and integrate data. Ontology development has itself become a collaborative process, with members of the larger research community contributing to and commenting on emerging ontologies. We plan to extend the Protégé ontology editor (the most widely used ontology editor today, with almost 100,000 registered users) to support collaborative development of ontologies, and to evaluate the new tools by deploying them at the World Health Organization for the development of ICD-11 and in other settings.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM086587-02
Application #: 7774343
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Project Start: 2009-03-01
Project End: 2013-02-28
Budget Start: 2010-03-01
Budget End: 2011-02-28
Support Year: 2
Fiscal Year: 2010
Total Cost: $525,262
Indirect Cost:

Name: Stanford University
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 009214214
City: Stanford
State: CA
Country: United States
Zip Code: 94305

Ochs, Christopher; Perl, Yehoshua; Geller, James et al. (2017) An empirical analysis of ontology reuse in BioPortal. J Biomed Inform 71:165-177
Gonçalves, Rafael S; Tu, Samson W; Nyulas, Csongor I et al. (2017) An ontology-driven tool for structured data acquisition using Web forms. J Biomed Semantics 8:26
Lou, Yun; Tu, Samson W; Nyulas, Csongor et al. (2017) Use of ontology structure and Bayesian models to aid the crowdsourcing of ICD-11 sanctioning rules. J Biomed Inform 68:20-34
Kamdar, Maulik R; Tudorache, Tania; Musen, Mark A (2017) A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies. Semant Web 8:853-871
Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J et al. (2016) Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology. J Biomed Inform 60:199-209
Ziaimatin, Hasti; Groza, Tudor; Tudorache, Tania et al. (2016) Modelling expertise at different levels of granularity using semantic similarity measures in the context of collaborative knowledge-curation platforms. J Intell Inf Syst 47:469-490
Mortensen, Jonathan M; Minty, Evan P; Januszyk, Michael et al. (2015) Using the wisdom of the crowds to find critical errors in biomedical ontologies: a study of SNOMED CT. J Am Med Inform Assoc 22:640-8
Groza, Tudor; Tudorache, Tania; Robinson, Peter N et al. (2015) Capturing domain knowledge from multiple sources: the rare bone disorders use case. J Biomed Semantics 6:21
Lamprecht, Daniel; Strohmaier, Markus; Helic, Denis et al. (2015) Using ontologies to model human navigation behavior in information networks: A study based on Wikipedia. Semant Web 6:403-422
Wang, Hao; Tudorache, Tania; Dou, Dejing et al. (2015) Analysis and Prediction of User Editing Patterns in Ontology Development Projects. J Data Semant 4:117-132
