The specific aim of the UniProt Consortium is to provide a centralized protein sequence and function resource by enhancing the UniProt Knowledgebase (UniProtKB) and, through a range of dissemination strategies, ensuring that the diverse information in UniProt is of use to a broad scientific user community. The UniProtKB will include a variety of data types including, but not limited to, protein sequences, nomenclature, family classifications, and alternatively spliced and modified forms. Relevant information on protein function will be included, along with potential protein interactions, expression patterns, pathways and controlled vocabulary terms from the Gene Ontology (GO). Annotation methods applied in the UniProtKB will include extraction of information from the literature and computational analyses, as well as integration and mining of large-scale data sets. The types of evidence and methods of annotation for both experimental and computational data will be recorded, along with attribution of the source. The UniProtKB will rely on high interoperability with other databases, while exploiting novel approaches to encourage community curation. To facilitate the use of UniProt, the UniProt Consortium will enhance its existing user-friendly interfaces and tools to allow for simple and complex queries and for retrieval of large datasets. Database records will be downloadable in a defined, parsable format. An efficient and responsive user support service will be provided. Finally, the UniProt Consortium will maintain the flexibility and adaptability needed to respond to the changing needs of the scientific community.

The broad, long-term objectives of this project are: to provide the scientific community with the Universal Protein Resource (UniProt) as a comprehensive, high-quality and freely accessible resource of protein sequence and functional information; to enable scientists to identify and analyze products of protein-coding genes by making text- and sequence-based queries in the UniProt databases; and to provide efficient and unencumbered access to the databases produced by the UniProt Consortium.

Public Health Relevance

The databases produced by the UniProt Consortium will provide researchers with integrated access to protein sequence and function information by gathering and enriching data from genomics and proteomics projects, as well as results published by individual researchers. This is a crucial step in making genomics and proteomics research results easily accessible to support biomedical research in academia and industry, and hence in facilitating the development of preventive and curative strategies for human health.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Biotechnology Resource Cooperative Agreements (U41)
Project #: 4U41HG006104-04
Application #: 8696864
Study Section: Special Emphasis Panel (ZHG1-HGR-M (O2))
Program Officer: Bonazzi, Vivien
Project Start: 2010-09-27
Project End: 2014-04-30
Budget Start: 2013-08-01
Budget End: 2014-04-30
Support Year: 4
Fiscal Year: 2013
Total Cost: $3,945,000
Indirect Cost: $118,350

Name: European Molecular Biology Laboratory
Department:
Type:
DUNS #: 321691735
City: Heidelberg
State:
Country: Germany
Zip Code: 69117

Herrero, Javier; Muffato, Matthieu; Beal, Kathryn et al. (2016) Ensembl comparative genomics resources. Database (Oxford) 2016:
Pundir, Sangya; Martin, Maria J; O'Donovan, Claire et al. (2016) UniProt Tools. Curr Protoc Bioinformatics 53:1.29.1-15
Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael et al. (2016) UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol Biol 1374:23-54
UniProt Consortium (2015) UniProt: a hub for protein information. Nucleic Acids Res 43:D204-12
Bastian, Frederic B; Chibucos, Marcus C; Gaudet, Pascale et al. (2015) The Confidence Information Ontology: a step towards a standard for asserting confidence in annotations. Database (Oxford) 2015:bav043
Suzek, Baris E; Wang, Yuqi; Huang, Hongzhan et al. (2015) UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31:926-32
Pedruzzi, Ivo; Rivoire, Catherine; Auchincloss, Andrea H et al. (2015) HAMAP in 2015: updates to the protein family classification and annotation system. Nucleic Acids Res 43:D1064-70
Pundir, Sangya; Magrane, Michele; Martin, Maria J et al. (2015) Searching and Navigating UniProt Databases. Curr Protoc Bioinformatics 50:1.27.1-10
Huntley, Rachael P; Sawford, Tony; Mutowo-Meullenet, Prudence et al. (2015) The GOA database: Gene Ontology annotation updates for 2015. Nucleic Acids Res 43:D1057-63
Foulger, R E; Osumi-Sutherland, D; McIntosh, B K et al. (2015) Representing virus-host interactions and other multi-organism processes in the Gene Ontology. BMC Microbiol 15:146

Showing the most recent 10 out of 34 publications