Biomedical ontologies are increasingly important in genomic and proteomic research where complex data in disparate resources need to be integrated. In particular, the Gene Ontology (GO) has become a standard for genome annotation and the Open Biomedical Ontologies (OBO) is an umbrella for ontologies shared across biomedical domains. There is, however, a gap in the current OBO library: an ontology of protein classes and their relationships. The goal of the proposed project is to develop a PRotein Ontology (PRO) to facilitate protein annotation and functional discovery. PRO will be developed within the OBO Foundry, adopting principles specifying best practices in ontology development.
The specific aims of this project are to: (i) develop a Protein Evolution (ProEvo) ontology for the description of proteins based on evolutionary relationships, (ii) develop an ontology for Protein Modified Forms (ProMod) for the representation of multiple protein forms of a gene, (iii) specify relationships between PRO, GO and other OBO ontologies, and (iv) disseminate the PRO ontology and develop scientific case studies. ProEvo will be developed based on manually curated families of full-length proteins in PIRSF and PANTHER and their constituent domains in SCOP and Pfam, initially for human protein-containing classes. The ProEvo classes and their relationships with GO terms will formalize the relationships between phylogeny and function, to allow more consistent and accurate inference of function based on experimental evidence. ProMod will define protein products generated by genetic variation, alternative splicing, proteolytic cleavage, and post-translational modification, initially for human and mouse proteins using fully curated entries in UniProtKB/Swiss-Prot and the Mouse Genome Initiative, to support specific annotation of proteomes at the precise levels of variants, isoforms, and modified products. Defined based on scientific case studies of human and mouse disease proteins, the relations between PRO, GO, and other OBO ontologies, such as Disease Ontology, will capture the relationships required for disease understanding. For PRO dissemination to the community, the ontology will be integrated into OBO, new relations will be added to the OBO Relations Ontology, and an annual protein ontology workshop will be organized. PRO will also be accessible from the PIR web site for integrative protein analysis. Through scientific meetings and collaborations, the PRO consortium will interact with the wider scientific community to ensure that PRO is useful and widely adopted.
The PRO ontology will allow researchers to explore functional and evolutionary relationships of proteins to improve understanding of disease and identify potential diagnostic and therapeutic targets.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM080646-05
Application #: 7841851
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Remington, Karin A
Project Start: 2007-05-01
Project End: 2011-09-20
Budget Start: 2010-05-01
Budget End: 2011-09-20
Support Year: 5
Fiscal Year: 2010
Total Cost: $507,072
Indirect Cost:

Name: University of Delaware
Department: Biostatistics & Other Math Sci
Type: Schools of Engineering
DUNS #: 059007500
City: Newark
State: DE
Country: United States
Zip Code: 19716

Natale, Darren A; Arighi, Cecilia N; Blake, Judith A et al. (2017) Protein Ontology (PRO): enhancing and scaling up the representation of protein entities. Nucleic Acids Res 45:D339-D346
Arighi, Cecilia N; Drabkin, Harold; Christie, Karen R et al. (2017) Tutorial on Protein Ontology Resources. Methods Mol Biol 1558:57-78
Gurcan, Metin N; Tomaszewski, John; Overton, James A et al. (2017) Developing the Quantitative Histopathology Image Ontology (QHIO): A case study using the hot spot detection problem. J Biomed Inform 66:129-135
Zaru, Rossana; Magrane, Michele; O'Donovan, Claire et al. (2017) From the research laboratory to the database: the Caenorhabditis elegans kinome in UniProtKB. Biochem J 474:493-515
Wang, Qinghua; Ross, Karen E; Huang, Hongzhan et al. (2017) Analysis of Protein Phosphorylation and Its Functional Impact on Protein-Protein Interactions via Text Mining of the Scientific Literature. Methods Mol Biol 1558:213-232
The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res 45:D158-D169
Ross, Karen E; Natale, Darren A; Arighi, Cecilia et al. (2016) Scalable Text Mining Assisted Curation of Post-Translationally Modified Proteoforms in the Protein Ontology. CEUR Workshop Proc 1747:
Huang, Jingshan; Gutierrez, Fernando; Strachan, Harrison J et al. (2016) OmniSearch: a semantic search system based on the Ontology for MIcroRNA Target (OMIT) for microRNA-target gene interaction data. J Biomed Semantics 7:25
Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael et al. (2016) UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 1374:23-54
Bandrowski, Anita; Brinkman, Ryan; Brochhausen, Mathias et al. (2016) The Ontology for Biomedical Investigations. PLoS One 11:e0154556

Showing the most recent 10 out of 49 publications