The mission of the Universal Protein Resource (UniProt) is to support biomedical research by providing a freely available, stable, comprehensive, richly and accurately annotated protein sequence knowledgebase (www.uniprot.org). UniProt integrates, interprets and standardizes data from a multitude of sources to achieve the most comprehensive catalog of protein sequences and functional annotation available to date, providing information from hundreds of thousands of publications for tens of millions of proteins from tens of thousands of species. The activities proposed here will increase the utility of UniProt for biomedical research and precision medicine. The expert-curated functional information provided by UniProt is widely acknowledged to be of exceptional quality and is continuously updated as new knowledge becomes available.
Our first aim will be to continue to curate the scientific literature to ensure UniProt remains up to date. We will also work with the text-mining community to continue to improve curation efficiency. The 0.5 million curated records are complemented by 80 million records for uncharacterized proteins. To ensure their usefulness for the community, we will continue to develop our automatic annotation systems to annotate these proteins based on knowledge of characterized proteins.
Our third aim is to connect to and integrate protein data from resources around the world to make UniProt the global hub of protein information. The integration of clinical variation data as well as metabolomics information with proteins will help to support the multi-omics approaches of precision medicine.
Our fourth aim describes the production of the resource to ensure that our data is freely available according to the FAIR principles. UniProt forms a foundation for hundreds of life sciences data resources. Continuous software development is needed to ensure delivery of this key component of the life science infrastructure. The UniProt website is used by hundreds of thousands of scientists every month.
The final aim describes how we will enable this community to make the best use of UniProt through user training, outreach and improved user interfaces, driven by user testing.

Public Health Relevance

UniProt is the world's leading resource of protein sequence and functional information, covering all species including Homo sapiens, model organisms and pathogens. UniProt annotates protein function using community-standard ontologies to support the interpretation of genomic and other datasets on which biomedical research and precision medicine depend. The activities described here will increase the utility of UniProt for biomedical research and precision medicine, enhancing research efficiency and understanding of human disease.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Resource-Related Research Projects--Cooperative Agreements (U24)
Project #
2U24HG007822-05
Application #
9527210
Study Section
Special Emphasis Panel (ZHG1)
Program Officer
Pillai, Ajay
Project Start
2014-09-18
Project End
2021-05-31
Budget Start
2018-06-01
Budget End
2019-05-31
Support Year
5
Fiscal Year
2018
Total Cost
Indirect Cost
Name
European Molecular Biology Laboratory
Department
Type
DUNS #
321691735
City
Heidelberg
State
Country
Germany
Zip Code
69117
Watkins, Xavier; Garcia, Leyla J; Pundir, Sangya et al. (2017) ProtVista: visualization of protein sequence annotations. Bioinformatics 33:2040-2041