Biomedical ontologies are increasingly important in genomic and proteomic research where complex data in disparate resources need to be integrated. In particular, the Gene Ontology (GO) has become a standard for genome annotation and the Open Biomedical Ontologies (OBO) is an umbrella for ontologies shared across biomedical domains. There is, however, a gap in the current OBO library: an ontology of protein classes and their relationships. The goal of the proposed project is to develop a PRotein Ontology (PRO) to facilitate protein annotation and functional discovery. PRO will be developed within the OBO Foundry, adopting principles specifying best practices in ontology development.
The specific aims of this project are to: (i) develop a Protein Evolution (ProEvo) ontology for the description of proteins based on evolutionary relationships, (ii) develop an ontology for Protein Modified Forms (ProMod) for the representation of multiple protein forms of a gene, (iii) specify relationships between PRO, GO and other OBO ontologies, and (iv) disseminate the PRO ontology and develop scientific case studies. ProEvo will be developed based on manually-curated families of full-length proteins in PIRSF and PANTHER and their constituent domains in SCOP and Pfam, initially for human protein-containing classes. The ProEvo classes and their relationships with GO terms will formalize the relationships between phylogeny and function, to allow more consistent and accurate inference of function based on experimental evidence. ProMod will define protein products generated by genetic variation, alternative splicing, proteolytic cleavage, and post-translational modification, initially for human and mouse proteins using fully-curated entries in UniProtKB/Swiss-Prot and the Mouse Genome Initiative, to support specific annotation of proteomes at the precise levels of variants, isoforms, and modified products. Defined based on scientific case studies of human and mouse disease proteins, the relations between PRO, GO, and other OBO ontologies, such as Disease Ontology, will capture the relationships required for disease understanding. For PRO dissemination to the community, the ontology will be integrated into OBO, new relations will be added to the OBO Relations Ontology, and an annual protein ontology workshop will be organized. PRO will also be accessible from the PIR web site for integrative protein analysis. Through scientific meetings and collaborations, the PRO consortium will interact with the wider scientific community to ensure that PRO is useful and widely adopted.
The PRO ontology will allow researchers to explore functional and evolutionary relationships of proteins to improve understanding of disease and identify potential diagnostic and therapeutic targets.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Remington, Karin A
University of Delaware
Biostatistics & Other Math Sci
Schools of Engineering
United States
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria et al. (2016) The Ontology of Biological and Clinical Statistics (OBCS) for standardized and reproducible statistical analysis. J Biomed Semantics 7:53
Pundir, Sangya; Martin, Maria J; O'Donovan, Claire et al. (2016) UniProt Tools. Curr Protoc Bioinformatics 53:1.29.1-15
Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J et al. (2016) OmniSearch: a semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data. J Biomed Semantics 7:25
Diehl, Alexander D; Meehan, Terrence F; Bradford, Yvonne M et al. (2016) The Cell Ontology 2016: enhanced content, modularization, and ontology interoperability. J Biomed Semantics 7:44
Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael et al. (2016) UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 1374:23-54
Huang, Jingshan; Eilbeck, Karen; Smith, Barry et al. (2016) The Non-Coding RNA Ontology (NCRO): a comprehensive resource for the unification of non-coding RNA biology. J Biomed Semantics 7:24
Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias et al. (2016) The Ontology for Biomedical Investigations. PLoS One 11:e0154556
Holliday, Gemma L; Bairoch, Amos; Bagos, Pantelis G et al. (2015) Key challenges for the creation and maintenance of specialist protein resources. Proteins 83:1005-13
Çelen, İrem; Ross, Karen E; Arighi, Cecilia N et al. (2015) Bioinformatics Knowledge Map for Analysis of Beta-Catenin Function in Cancer. PLoS One 10:e0141773
Courtot, Mélanie; Meskas, Justin; Diehl, Alexander D et al. (2015) flowCL: ontology-based cell population labelling in flow cytometry. Bioinformatics 31:1337-9

Showing the most recent 10 out of 43 publications