The development of ontologies that define entities and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage the burgeoning data that are pervasive in biology and medicine. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies. To date, these groups of ontology developers have been limited by the lack of methods and tools that facilitate distributed, collaborative engineering of large-scale ontologies and vocabularies. In this proposal, we outline three specific aims. First, we will explore basic computational methods that are essential for collaborative ontology engineering. We will investigate methods for representing diverse collaborative workflows, information about changes and concept history, trust, and provenance, and for recording decision making and design rationale. Empirical analysis of existing ontology-development projects will inform our construction of models for collaborative development workflows that will guide the processes of authoring, reviewing, and curating biomedical ontologies. Second, we will use the results from our first specific aim to build cProtégé, a set of robust, customizable, interactive tools to support distributed users in their collaborative work to build and edit terminologies and ontologies. Third, we will evaluate our work in the context of real-world, large-scale ontology-engineering projects, including the autism ontology of the National Database for Autism Research; the 11th revision of the WHO's International Classification of Diseases; the Ontology for Biomedical Investigations, under development by a wide range of NIH-supported researchers; and BiomedGT, under development by NCI.
It is no longer feasible to imagine that investigators can create biomedical ontologies working independently. The collaborative methods that we will study and the tools that we will build will lead to expanded opportunities to support the diverse data- and knowledge-intensive activities that pervade BISTI, the CTSAs, the NCBCs, and myriad biomedical initiatives that require robust, scalable ontologies.

Public Health Relevance

The knowledge-based nature of modern medicine requires the use of ontologies and terminologies to process and integrate data. Ontology development has itself become a collaborative process, with members of the larger research community contributing to and commenting on emerging ontologies. We plan to extend the Protégé ontology editor (the most widely used ontology editor today, with almost 100,000 registered users) to support collaborative development of ontologies, and to evaluate the new tools by deploying them at the World Health Organization for the development of ICD-11 and in other settings.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM086587-02
Application #
7774343
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Project Start
2009-03-01
Project End
2013-02-28
Budget Start
2010-03-01
Budget End
2011-02-28
Support Year
2
Fiscal Year
2010
Total Cost
$525,262
Indirect Cost
Name
Stanford University
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94305
Ławrynowicz, Agnieszka; Potoniec, Jedrzej; Robaczyk, Michał et al. (2018) Discovery of Emerging Design Patterns in Ontologies Using Tree Mining. Semant Web 9:517-544
Kamdar, Maulik R; Walk, Simon; Tudorache, Tania et al. (2018) Analyzing user interactions with biomedical ontologies: A visual perspective. Web Semant 49:16-30
Lou, Yun; Tu, Samson W; Nyulas, Csongor et al. (2017) Use of ontology structure and Bayesian models to aid the crowdsourcing of ICD-11 sanctioning rules. J Biomed Inform 68:20-34
Gonçalves, Rafael S; Tu, Samson W; Nyulas, Csongor I et al. (2017) An ontology-driven tool for structured data acquisition using Web forms. J Biomed Semantics 8:26
Kamdar, Maulik R; Walk, Simon; Tudorache, Tania et al. (2017) BiOnIC: A Catalog of User Interactions with Biomedical Ontologies. Semant Web ISWC 10588:130-138
Kamdar, Maulik R; Tudorache, Tania; Musen, Mark A (2017) A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies. Semant Web 8:853-871
Ochs, Christopher; Perl, Yehoshua; Geller, James et al. (2017) An empirical analysis of ontology reuse in BioPortal. J Biomed Inform 71:165-177
Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J et al. (2016) Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology. J Biomed Inform 60:199-209
Kamdar, Maulik R; Wu, Michelle J (2016) PRISM: A Data-Driven Platform for Monitoring Mental Health. Pac Symp Biocomput 21:333-44
Ziaimatin, Hasti; Groza, Tudor; Tudorache, Tania et al. (2016) Modelling expertise at different levels of granularity using semantic similarity measures in the context of collaborative knowledge-curation platforms. J Intell Inf Syst 47:469-490

Showing the most recent 10 out of 24 publications