Biomedical ontologies are critical tools for the accurate representation and integration of genome-scale data in biomedical and translational research. The OBO (Open Biological and Biomedical Ontologies) Foundry is a community effort to develop a systematic and coordinated framework for evidence-based ontology development on the basis of an evolving set of best practice principles. The Protein Ontology (PRO) is the reference ontology for proteins within the OBO Foundry, and is, with the Gene Ontology, one of the first six ontologies recommended by the Foundry as preferred targets for community convergence. To provide a basic ontological framework to capture protein knowledge in a systems biology context, PRO encompasses three sub-ontologies to represent (1) proteins from homologous genes based on evolutionary relatedness (ProEvo); (2) protein forms produced from a given gene, including splice isoforms, mutation variants, and co- or post-translationally modified forms (ProForm); and (3) protein-containing complexes (ProComp). This competitive renewal grant application aims to further develop PRO in order to facilitate its semantic and computational use by the biomedical research community and thereby broaden its scientific impact for discovery and reasoning in the health sciences.
The specific aims are: (i) to enhance the PRO ontological framework; (ii) to broaden the coverage of protein objects; (iii) to enhance the PRO curation platform, website and visual representation; (iv) to develop driving clinical projects; and (v) to expand the scientific impact, adoption and dissemination of PRO. The ontological framework will capture new types of protein objects and relations and connect to semantic resources and reasoning tools. PRO will broaden coverage through mappings and definitions of relations to connect protein objects in existing knowledge bases, and via semi-automated import of protein forms and complexes from curated databases. A graphical network representation will seamlessly connect protein forms and complexes across taxa in biological context for disease modeling. Use cases and two specific Driving Clinical Projects (one for reasoning and hypothesis generation for Alzheimer's disease, and one for flow cytometry data representation and immune system modeling) will demonstrate knowledge integration in the OBO Foundry framework as an enabling research infrastructure for reasoning and modeling in the health sciences. We will host annual PRO Scientific Dissemination Meetings addressing the protein-related needs of the bio- and clinical informatics research communities. PRO will be disseminated via multiple websites and ontological services, as well as through reciprocal links with major knowledge resources. ID management for protein objects will include mapping of PRO terms to common database identifiers with well-defined relations and expedited creation of requested PRO terms and UIDs. The PRO research has several unique features, and its significance is multifold.
For knowledge representation, PRO defines precise protein objects to support accurate annotation at the appropriate granularity and provides the ontological framework to connect all protein types necessary to model biology, in particular linking specific protein forms to particular complexes in biological context. For semantic data integration, PRO provides the ontological structure to connect, via specified relations, the vast amounts of protein knowledge contained in databases to support new hypothesis generation and testing. PRO therefore addresses the current gaps in the bioinformatics infrastructure for protein representations in a way that makes knowledge about proteins more accessible to computational reasoning, fully leveraging and complementing existing knowledge sources. The proposed research will allow the PRO Consortium to bring together the resources and expertise from several collaborating institutions to deepen and broaden PRO as a mature research infrastructure for biomedical knowledge discovery and translational science.

Public Health Relevance

The PRO ontology will allow researchers to capture and accurately represent scientific knowledge of proteins, providing a research infrastructure for modeling biological systems, improving the understanding of human disease, and aiding in the identification of potential diagnostic and therapeutic targets.

Agency
National Institutes of Health (NIH)
Type
Research Project (R01)
Project #
5R01GM080646-09
Application #
8700422
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Ravichandran, Veerasamy
Project Start
Project End
Budget Start
Budget End
Support Year
9
Fiscal Year
2014
Total Cost
Indirect Cost
Name
University of Delaware
Department
Biostatistics & Other Math Sci
Type
Biomed Engr/Col Engr/Engr Sta
DUNS #
City
Newark
State
DE
Country
United States
Zip Code
19716
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria et al. (2016) The Ontology of Biological and Clinical Statistics (OBCS) for standardized and reproducible statistical analysis. J Biomed Semantics 7:53
Pundir, Sangya; Martin, Maria J; O'Donovan, Claire et al. (2016) UniProt Tools. Curr Protoc Bioinformatics 53:1.29.1-15
Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J et al. (2016) OmniSearch: a semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data. J Biomed Semantics 7:25
Diehl, Alexander D; Meehan, Terrence F; Bradford, Yvonne M et al. (2016) The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability. J Biomed Semantics 7:44
Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael et al. (2016) UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 1374:23-54
Huang, Jingshan; Eilbeck, Karen; Smith, Barry et al. (2016) The Non-Coding RNA Ontology (NCRO): a comprehensive resource for the unification of non-coding RNA biology. J Biomed Semantics 7:24
Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias et al. (2016) The Ontology for Biomedical Investigations. PLoS One 11:e0154556
Holliday, Gemma L; Bairoch, Amos; Bagos, Pantelis G et al. (2015) Key challenges for the creation and maintenance of specialist protein resources. Proteins 83:1005-13
Çelen, İrem; Ross, Karen E; Arighi, Cecilia N et al. (2015) Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer. PLoS One 10:e0141773
Courtot, Mélanie; Meskas, Justin; Diehl, Alexander D et al. (2015) flowCL: ontology-based cell population labelling in flow cytometry. Bioinformatics 31:1337-9

Showing the most recent 10 out of 43 publications