The construction of ontologies that define the entities in an application area and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage burgeoning volumes of data. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of biomedical investigators to work building large ontologies. We developed and evaluated the Collaborative Protégé system in the first phase of our research project. This software system has become an indispensable open-source resource for an international community of scientists who develop ontologies in a cooperative, distributed manner. In this competing renewal proposal, we describe novel data-driven methods and tools that promise to make collaborative ontology design both more streamlined and more principled. Our goal is to create a more empirical basis for ontology engineering, and to develop methods whereby the ontology-engineering enterprise can profit from data regarding the underlying processes, and those processes can in turn generate increasing amounts of data to inform future ontology-engineering activities. Our research plan entails three specific aims. First, we will enable ontology developers to apply ontology-design patterns (ODPs) to their ontologies, and we will measure the way in which these patterns alter the ontology-engineering process. Second, we will analyze the vast amounts of log data that we collect from users of Collaborative Protégé to understand the patterns of ontology development. We will use these patterns to recommend to developers areas of ontologies that may need their attention, facilitating the process of reaching consensus and making collaborative ontology engineering more efficient. Finally, we will use the extensive data collected by our group and others to understand how scientists reuse terms from various ontologies, and we will use these emerging patterns to facilitate term reuse. Each of these analyses not only will increase our understanding of collaboration in scientific modeling, but also will lead to new technology within our Collaborative Protégé suite that will improve the ontology-development process and make collaboration among biomedical scientists more efficient.

Public Health Relevance

Collaborative Protégé is a software system that helps a burgeoning user community to cooperate in developing ontologies that enhance biomedical research and improve patient care. Collaborative Protégé supports scientists, clinician researchers, and informatics workers in building ontologies to solve problems in data annotation, data integration, information retrieval, natural-language processing, electronic patient record systems, and decision support. The proposed research will develop data-driven methods to identify patterns in the design, development, and use of ontologies, and will apply these methods to build new technology that both facilitates the ontology-development process and makes ontology design more principled.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Stanford University
Social Sciences
Schools of Medicine
United States
Mortensen, Jonathan M; Telis, Natalie; Hughey, Jacob J et al. (2016) Is the crowd better as an assistant or a replacement in ontology engineering? An exploration through the lens of the Gene Ontology. J Biomed Inform 60:199-209
Lamprecht, Daniel; Strohmaier, Markus; Helic, Denis et al. (2015) Using ontologies to model human navigation behavior in information networks: A study based on Wikipedia. Semant Web 6:403-422
Groza, Tudor; Tudorache, Tania; Robinson, Peter N et al. (2015) Capturing domain knowledge from multiple sources: the rare bone disorders use case. J Biomed Semantics 6:21
Wang, Hao; Tudorache, Tania; Dou, Dejing et al. (2015) Analysis and Prediction of User Editing Patterns in Ontology Development Projects. J Data Semant 4:117-132
Musen, Mark A; Protégé Team (2015) The Protégé Project: A Look Back and a Look Forward. AI Matters 1:4-12
Mortensen, Jonathan M; Minty, Evan P; Januszyk, Michael et al. (2015) Using the wisdom of the crowds to find critical errors in biomedical ontologies: a study of SNOMED CT. J Am Med Inform Assoc 22:640-8
Mortensen, Jonathan M; Musen, Mark A; Noy, Natalya F (2014) An empirically derived taxonomy of errors in SNOMED CT. AMIA Annu Symp Proc 2014:899-906
Horridge, Matthew; Tudorache, Tania; Nuylas, Csongor et al. (2014) WebProtégé: a collaborative Web-based platform for editing biomedical ontologies. Bioinformatics 30:2384-5
Walk, Simon; Singer, Philipp; Strohmaier, Markus et al. (2014) Discovering beaten paths in collaborative ontology-engineering projects using Markov chains. J Biomed Inform 51:254-71
Strohmaier, Markus; Walk, Simon; Pöschko, Jan et al. (2013) How Ontologies are Made: Studying the Hidden Social Dynamics Behind Collaborative Ontology Engineering Projects. Web Semant 20:
