The engineering of ontologies that define the entities in an application area and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage rapidly growing volumes of data. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies. The Protégé system has become an indispensable open-source resource for an enormous international community of scientists, supporting the development, maintenance, and use of ontologies and electronic knowledge bases by biomedical investigators everywhere. The number of registered Protégé users has grown from 3,500 in 2002 to more than 195,000 as of this writing. To date, however, the use of ontologies in biomedicine has been limited by the complexity of ontology-development tools, which often makes ontologies inaccessible to many biomedical scientists. In this proposal, we will develop new methods and tools that significantly lower the barrier to entry for ontology development, expanding Protégé to provide intuitive and user-friendly ontology-acquisition methods throughout the ontology lifecycle. Our plan entails five specific aims. First, we will develop methods that enable initial specification of ontology terms in an informal manner, using lists and diagrams. Scientists will be able to start modeling their domain without having to think in terms of formal ontological distinctions. Second, we will provide intuitive, easy-to-use tools for ontology specification that will aid developers as they start to formalize their models. Third, we will track the requirements that an ontology must address and develop novel methods for evaluating ontology coverage based on these requirements.
Fourth, for ontologies whose inherently complex internal structure cannot be represented fully using only simple ontology constructs, we will develop methods that create templates covering regular structures in the ontology. Scientists will then be able to fill out forms based on these templates, with Protégé generating the corresponding logical structure in the background. Fifth, we will continue to expand and support the thriving Protégé user community as it grows to include the biomedical scientists who will now be able to build the ontologies that support their data-driven research and discoveries.

Public Health Relevance

Protégé is a software system that helps a burgeoning user community to develop ontologies that enhance biomedical research and improve patient care. Protégé supports scientists, clinician researchers, and workers in informatics in data annotation, data integration, information retrieval, natural-language processing, electronic patient record systems, and decision-support systems. The Protégé resource provides critical semantic-technology infrastructure and expertise for biomedical research and the development of advanced clinical information systems.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Stanford University
Internal Medicine/Medicine
Schools of Medicine
United States
Ławrynowicz, Agnieszka; Potoniec, Jedrzej; Robaczyk, Michał et al. (2018) Discovery of Emerging Design Patterns in Ontologies Using Tree Mining. Semant Web 9:517-544
Kamdar, Maulik R; Walk, Simon; Tudorache, Tania et al. (2018) Analyzing user interactions with biomedical ontologies: A visual perspective. Web Semant 49:16-30
Lou, Yun; Tu, Samson W; Nyulas, Csongor et al. (2017) Use of ontology structure and Bayesian models to aid the crowdsourcing of ICD-11 sanctioning rules. J Biomed Inform 68:20-34
Gonçalves, Rafael S; Tu, Samson W; Nyulas, Csongor I et al. (2017) An ontology-driven tool for structured data acquisition using Web forms. J Biomed Semantics 8:26
Kamdar, Maulik R; Tudorache, Tania; Musen, Mark A (2017) A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies. Semant Web 8:853-871
Ziaimatin, Hasti; Groza, Tudor; Tudorache, Tania et al. (2016) Modelling expertise at different levels of granularity using semantic similarity measures in the context of collaborative knowledge-curation platforms. J Intell Inf Syst 47:469-490
Kamdar, Maulik R; Tudorache, Tania; Musen, Mark A (2015) Investigating Term Reuse and Overlap in Biomedical Ontologies. CEUR Workshop Proc 1515:
Groza, Tudor; Tudorache, Tania; Robinson, Peter N et al. (2015) Capturing domain knowledge from multiple sources: the rare bone disorders use case. J Biomed Semantics 6:21
Mortensen, Jonathan M; Minty, Evan P; Januszyk, Michael et al. (2015) Using the wisdom of the crowds to find critical errors in biomedical ontologies: a study of SNOMED CT. J Am Med Inform Assoc 22:640-8
Lamprecht, Daniel; Strohmaier, Markus; Helic, Denis et al. (2015) Using ontologies to model human navigation behavior in information networks: A study based on Wikipedia. Semant Web 6:403-422

Showing the most recent 10 out of 16 publications