Bioinformatics infrastructure activities are crucial to modern biological research. Complete and up-to-date databases of biological knowledge are vital for increasingly information-dependent biological and biotechnological research. With the recent accumulation of genome sequences for many organisms, most notably the draft human sequence, attention has turned to identifying the proteins encoded by these genomes and determining their functions. In the Universal Protein Resource (UniProt) project, funded by the NIH, major European and American protein sequence databases have joined forces to develop a central resource for protein sequences and functions, providing a cornerstone for a wide range of scientists active in modern biological research, especially in the field of proteomics. The broad, long-term objectives of this project are to provide, through UniProt, a stable and comprehensive resource for information on proteins, their sequences, and their functions; to enable scientists to use UniProt to identify and analyze genes and their products and to query across databases containing complementary information; and to provide efficient and unencumbered access to the databases produced by the UniProt Consortium.
The specific aims are to maintain and further develop the UniProt Knowledgebase (UniProtKB) as the central database of curated protein sequences annotated with sequence and functional information; to maintain and further develop the UniProt Archive (UniParc) and create the UniProtKB entry history server, ensuring comprehensive coverage of all protein sequences and their annotation history; to maintain and further develop the UniProt Reference Clusters (UniRef), providing complete coverage of sequence space while hiding redundant sequences (but not their descriptions) from view; to facilitate the use of these databases by providing user-friendly interfaces, tools for simple and complex queries and for retrieval of large datasets, downloadable database records in a defined, parsable format, and user support services; and to provide the flexibility and adaptability needed to respond to the changing needs of the scientific community. The databases produced by the UniProt Consortium will facilitate the development of preventive and curative strategies for health maintenance by allowing researchers to integrate the enormous amount of data from the Human Genome Project and other genome projects, as well as from structural and functional genomics and proteomics projects, in order to understand the genetic and biological mechanisms that cause human disease.
Lopez, Rodrigo; Cowley, Andrew; Li, Weizhong et al. (2014) Using EMBL-EBI Services via Web Interface and Programmatically via Web Services. Curr Protoc Bioinformatics 48:3.12.1-50
Mutowo-Meullenet, Prudence; Huntley, Rachael P; Dimmer, Emily C et al. (2013) Use of Gene Ontology Annotation to understand the peroxisome proteome in humans. Database (Oxford) 2013:bas062
Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin et al. (2012) Text mining for the biocuration workflow. Database (Oxford) 2012:bas020
Burmester, Anke; Shelest, Ekaterina; Glöckner, Gernot et al. (2011) Comparative and functional genomics provide insights into the pathogenicity of dermatophytic fungi. Genome Biol 12:R7
Sriranganadane, Dev; Waridel, Patrice; Salamin, Karine et al. (2011) Identification of novel secreted proteases during extracellular proteolysis by dermatophytes at acidic pH. Proteomics 11:4422-33
Chen, Chuming; Natale, Darren A; Finn, Robert D et al. (2011) Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation. PLoS One 6:e18910
Vasudevan, Sona; Vinayaka, C R; Natale, Darren A et al. (2011) Structure-guided rule-based annotation of protein functional sites in UniProt Knowledgebase. Methods Mol Biol 694:91-105
UniProt Consortium (2011) Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res 39:D214-9
Magrane, Michele; UniProt Consortium (2011) UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford) 2011:bar009
Hu, Zhang-Zhi; Huang, Hongzhan; Wu, Cathy H et al. (2011) Omics-based molecular target and biomarker identification. Methods Mol Biol 719:547-71
Showing the most recent 10 out of 58 publications