The long-term objective of the UniProt Consortium is to provide a centralized curated, accurate, stable, and comprehensive protein sequence and function resource by enhancing the UniProt Knowledgebase (UniProtKB) and ensuring that the diverse information in UniProt will be of use to a broad scientific user community by exploiting a range of dissemination strategies. This objective will enable new scientific discoveries in the fields of biomedical research that will enhance human health.
The specific aims of the project are:
1. To provide the experimental information associated with proteins through manual curation of the scientific literature. This will be achieved by identifying and curating newly experimentally characterized proteins and by updating existing proteins for new information. Interaction with other manually curated resources will be extended, and community standard interfaces used.
2. To automatically annotate experimentally uncharacterized proteins, increasing both coverage and depth. This will enhance the value of the exponentially growing protein sequence space and allow for the correction of erroneous information. UniProt's rule-based system will be made available to the community.
3. To act as the global central hub for protein information. This will be achieved by leveraging data from other resources either as imports or as links/visualization and organizing the data for its optimum use and navigation. New data types will be included as they become available, and the technical infrastructure will be developed to scale with the continued growth in protein sequences.
4. To maintain and further develop UniProt's website and other services. This will be done by creating or incorporating specialized visualizations from other resources for a more intuitive and better overview, and by providing appropriate web services and formats to facilitate the extraction of UniProt's rich data.
5. To actively seek user feedback from and provide training for existing and new user communities. Designated user experience testing and the engagement of new user communities will help to define future directions. Target research communities include researchers working on genetic variation, medical researchers and clinicians, cell biologists and drug discovery/pharmaceutical researchers.

Public Health Relevance

The databases produced by the UniProt Consortium provide researchers with integrated access to protein sequence and function by gathering and enriching data from genomics and proteomics projects as well as the results published by individual researchers. This is a crucial step in making genomics and proteomics research results easily accessible to support biomedical research in academia and industry and hence facilitate the development of preventive and curative strategies for human health.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Biotechnology Resource Cooperative Agreements (U41)
Study Section: Special Emphasis Panel (ZHG1-HGR-M (J2))
Program Officer: Pillai, Ajay
European Molecular Biology Laboratory
Research Institutes
Pichler, Klemens; Warner, Kate; Magrane, Michele et al. (2018) SPIN: Submitting Sequences Determined at Protein Level to UniProt. Curr Protoc Bioinformatics 62:e52
Ding, Ruoyao; Boutet, Emmanuel; Lieberherr, Damien et al. (2017) eGenPub, a text mining system for extending computationally mapped bibliography for UniProt Knowledgebase by capturing centrality. Database (Oxford) 2017:
Poux, Sylvain; Arighi, Cecilia N; Magrane, Michele et al. (2017) On expert curation and scalability: UniProtKB/Swiss-Prot as a case study. Bioinformatics 33:3454-3460
The UniProt Consortium (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res 45:D158-D169
Hulo, Chantal; Masson, Patrick; Toussaint, Ariane et al. (2017) Bacterial Virus Ontology; Coordinating across Databases. Viruses 9:
Pundir, Sangya; Martin, Maria J; O'Donovan, Claire (2017) UniProt Protein Knowledgebase. Methods Mol Biol 1558:41-55
Zaru, Rossana; Magrane, Michele; O'Donovan, Claire et al. (2017) From the research laboratory to the database: the Caenorhabditis elegans kinome in UniProtKB. Biochem J 474:493-515
Chen, Chuming; Huang, Hongzhan; Wu, Cathy H (2017) Protein Bioinformatics Databases and Resources. Methods Mol Biol 1558:3-39
Watkins, Xavier; Garcia, Leyla J; Pundir, Sangya et al. (2017) ProtVista: visualization of protein sequence annotations. Bioinformatics 33:2040-2041
Kılıç, Sefa; Sagitova, Dinara M; Wolfish, Shoshannah et al. (2016) From data repositories to submission portals: rethinking the role of domain-specific databases in CollecTF. Database (Oxford) 2016:
