The construction of ontologies that define the entities in an application area and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage burgeoning volumes of data. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of biomedical investigators to work building large ontologies. We developed and evaluated the Collaborative Protégé system in the first phase of our research project. This software system has become an indispensable open-source resource for an international community of scientists who develop ontologies in a cooperative, distributed manner. In this competing renewal proposal, we describe novel data-driven methods and tools that promise to make collaborative ontology design both more streamlined and more principled. Our goal is to create a more empirical basis for ontology engineering, and to develop methods whereby the ontology-engineering enterprise can profit from data regarding the underlying processes, while those processes in turn generate increasing amounts of data to inform future ontology-engineering activities. Our research plan entails three specific aims. First, we will enable ontology developers to apply ontology-design patterns (ODPs) to their ontologies, and we will measure the way in which these patterns alter the ontology-engineering process. Second, we will analyze the vast amounts of log data that we collect from users of Collaborative Protégé to understand the patterns of ontology development. We will use these patterns to recommend to developers areas of ontologies that may need their attention, facilitating the process of reaching consensus and making collaborative ontology engineering more efficient. Third, we will use the extensive data collected by our group and others to understand how scientists reuse terms from various ontologies, and we will use these emerging patterns to facilitate term reuse. Each of these analyses not only will increase our understanding of collaboration in scientific modeling, but also will lead to new technology within our Collaborative Protégé suite that will improve the ontology-development process and make collaboration among biomedical scientists more efficient.

Public Health Relevance

Collaborative Protégé is a software system that helps a burgeoning user community to cooperate in developing ontologies that enhance biomedical research and improve patient care. Collaborative Protégé helps scientists, clinician researchers, and workers in informatics to build ontologies that solve problems in data annotation, data integration, information retrieval, natural-language processing, electronic patient record systems, and decision support. The proposed research will develop data-driven methods to identify patterns in the design, development, and use of ontologies, and will apply these methods to build new technology that both facilitates the ontology-development process and makes ontology design more principled.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Stanford University
Social Sciences
Schools of Medicine
United States
Ławrynowicz, Agnieszka; Potoniec, Jedrzej; Robaczyk, Michał et al. (2018) Discovery of Emerging Design Patterns in Ontologies Using Tree Mining. Semant Web 9:517-544
Kamdar, Maulik R; Walk, Simon; Tudorache, Tania et al. (2018) Analyzing user interactions with biomedical ontologies: A visual perspective. Web Semant 49:16-30
Lou, Yun; Tu, Samson W; Nyulas, Csongor et al. (2017) Use of ontology structure and Bayesian models to aid the crowdsourcing of ICD-11 sanctioning rules. J Biomed Inform 68:20-34
Gonçalves, Rafael S; Tu, Samson W; Nyulas, Csongor I et al. (2017) An ontology-driven tool for structured data acquisition using Web forms. J Biomed Semantics 8:26
Kamdar, Maulik R; Walk, Simon; Tudorache, Tania et al. (2017) BiOnIC: A Catalog of User Interactions with Biomedical Ontologies. Semant Web ISWC 10588:130-138
Kamdar, Maulik R; Tudorache, Tania; Musen, Mark A (2017) A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies. Semant Web 8:853-871
Ochs, Christopher; Perl, Yehoshua; Geller, James et al. (2017) An empirical analysis of ontology reuse in BioPortal. J Biomed Inform 71:165-177
Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J et al. (2016) Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology. J Biomed Inform 60:199-209
Kamdar, Maulik R; Wu, Michelle J (2016) PRISM: A Data-Driven Platform for Monitoring Mental Health. Pac Symp Biocomput 21:333-44
Ziaimatin, Hasti; Groza, Tudor; Tudorache, Tania et al. (2016) Modelling expertise at different levels of granularity using semantic similarity measures in the context of collaborative knowledge-curation platforms. J Intell Inf Syst 47:469-490

Showing the most recent 10 out of 24 publications