Bioinformatics infrastructure activities are crucial to modern biological research. Complete and up-to-date databases of biological knowledge are vital for increasingly information-dependent biological and biotechnological research. With the recent accumulation of genome sequences for many organisms, most notably the draft human sequence, attention has turned to the identification and function of the proteins encoded by these genomes. In the Universal Protein Resource (UniProt) project, funded by the NIH, major European and American protein sequence databases have joined forces and developed a central resource for protein sequences and functions, providing a cornerstone for a wide range of scientists active in modern biological research, especially in the field of proteomics. The broad, long-term objectives of this project are to provide, in the Universal Protein Resource, a stable and comprehensive resource for information on proteins, their sequences, and their functions; to enable scientists to use UniProt to identify and analyze genes and their products and to make queries across databases containing complementary information; and to provide efficient and unencumbered access to the databases produced by the UniProt Consortium.
The specific aims are to maintain and further develop the UniProt Knowledgebase (UniProtKB) as the central database of curated protein sequences with annotations of sequence and functional information; to maintain and further develop the UniProt Archive (UniParc) and create the UniProtKB entry history server to ensure comprehensive coverage of all protein sequences and their annotation history; to maintain and further develop the UniProt Reference Clusters (UniRef) to provide complete coverage of sequence space while hiding redundant sequences (but not their descriptions) from view; to facilitate the use of these databases by providing user-friendly interfaces, tools for simple and complex queries and for retrieval of large datasets, downloadable database records in a defined, parsable format, and user support services; and to provide the flexibility and adaptability needed to respond to the changing needs of the scientific community. The databases produced by the UniProt Consortium will facilitate the development of preventive and curative strategies for health maintenance by allowing researchers to integrate the enormous amount of data from the Human Genome Project and other genome projects, as well as from structural and functional genomics and proteomics projects, to understand the genetic and biological mechanisms that cause human disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 3U01HG002712-06S1
Application #: 7886108
Study Section: Special Emphasis Panel (ZHG1-HGR-P (O3))
Program Officer: Bonazzi, Vivien
Project Start: 2002-09-30
Project End: 2010-08-31
Budget Start: 2009-09-01
Budget End: 2010-08-31
Support Year: 6
Fiscal Year: 2009
Total Cost: $5,952,504
Indirect Cost:

Name: European Molecular Biology Laboratory
Department:
Type:
DUNS #: 321691735
City: Heidelberg
State:
Country: Germany
Zip Code: 69117

Lopez, Rodrigo; Cowley, Andrew; Li, Weizhong et al. (2014) Using EMBL-EBI Services via Web Interface and Programmatically via Web Services. Curr Protoc Bioinformatics 48:3.12.1-50
Mutowo-Meullenet, Prudence; Huntley, Rachael P; Dimmer, Emily C et al. (2013) Use of Gene Ontology Annotation to understand the peroxisome proteome in humans. Database (Oxford) 2013:bas062
Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin et al. (2012) Text mining for the biocuration workflow. Database (Oxford) 2012:bas020
Vasudevan, Sona; Vinayaka, C R; Natale, Darren A et al. (2011) Structure-guided rule-based annotation of protein functional sites in UniProt Knowledgebase. Methods Mol Biol 694:91-105
UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res 39:D214-9
Magrane, Michele; UniProt Consortium (2011) UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford) 2011:bar009
Hu, Zhang-Zhi; Huang, Hongzhan; Wu, Cathy H et al. (2011) Omics-based molecular target and biomarker identification. Methods Mol Biol 719:547-71
Burmester, Anke; Shelest, Ekaterina; Glöckner, Gernot et al. (2011) Comparative and functional genomics provide insights into the pathogenicity of dermatophytic fungi. Genome Biol 12:R7
Sriranganadane, Dev; Waridel, Patrice; Salamin, Karine et al. (2011) Identification of novel secreted proteases during extracellular proteolysis by dermatophytes at acidic pH. Proteomics 11:4422-33
Chen, Chuming; Natale, Darren A; Finn, Robert D et al. (2011) Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation. PLoS One 6:e18910

Showing the most recent 10 out of 58 publications