The era of biotechnology is spawning a knowledge explosion of new and highly interrelated genomic information. Proteins, the expression of genes, are primarily responsible for the functioning or malfunctioning of the biological organism and, thus, are directly related to health and disease in humans. The broad, long-term objective of the Protein Information Resource (PIR) is to be a significant public domain resource for the scientific community that supports functional and structural genomic research.
The specific aims are (1) to provide comprehensive, timely, accurately annotated, fully classified, extensively cross-referenced, and freely accessible PIR protein sequence and auxiliary databases, together with sequence analysis and family classification tools, (2) to further develop a bioinformatics infrastructure to support PIR database activities and to foster collaboration and communication with the scientific community, and (3) to establish robust two-way communication with the scientific community for accurate PIR annotation and for dissemination and evaluation. The key elements of the research design are: (1) Comprehensive classification of proteins into non-overlapping families and superfamilies, with identification of homology domains and functional motifs, to allow automatic, rapid, and accurate classification and annotation. (2) The PIR knowledge base system, which contains comprehensive protein family, functional, and structural information, as well as linkage to multiple, heterogeneous external databases, in an object-relational database management system with a distributed object computing framework, and which integrates databases and data analysis and mining tools to facilitate protein information, exploration of protein structure and function, and accurate genome annotation. (3) The PIR Expert Scientific Council, a new paradigm for community involvement and participation in the maintenance of a public domain sequence database. Active community communication will allow PIR to obtain expert scientific contributions, acquire user requirements and evaluation, foster scientific cooperation, promote database interoperability, and provide wide system distribution. Successful completion of these aims will allow PIR to become an essential resource for basic scientific knowledge discovery towards the understanding of human health.
Huang, Hongzhan; Barker, Winona C; Chen, Yongxing et al. (2003) iProClass: an integrated database of protein family, function and structure information. Nucleic Acids Res 31:390-2
Wu, C H; Xiao, C; Hou, Z et al. (2001) iProClass: an integrated, comprehensive and annotated protein classification database. Nucleic Acids Res 29:52-4
Srinivasarao, G Y; Yeh, L S; Marzec, C R et al. (1999) PIR-ALN: a database of protein sequence alignments. Bioinformatics 15:382-90
Srinivasarao, G Y; Yeh, L S; Marzec, C R et al. (1999) Database of protein sequence alignments: PIR-ALN. Nucleic Acids Res 27:284-5
Barker, W C; Garavelli, J S; McGarvey, P B et al. (1999) The PIR-International Protein Sequence Database. Nucleic Acids Res 27:39-43
Barker, W C; Garavelli, J S; Haft, D H et al. (1998) The PIR-International Protein Sequence Database. Nucleic Acids Res 26:27-32
George, D G; Dodson, R J; Garavelli, J S et al. (1997) The Protein Information Resource (PIR) and the PIR-International Protein Sequence Database. Nucleic Acids Res 25:24-8
Barker, W C; Hunt, L T (1997) Analysis and organization of protein sequence data: a retrospective spanning four decades. J Protein Chem 16:459-62
Barker, W C; Pfeiffer, F; George, D G (1996) Superfamily classification in PIR-International Protein Sequence Database. Methods Enzymol 266:59-71
George, D G; Barker, W C; Mewes, H W et al. (1996) The PIR-International Protein Sequence Database. Nucleic Acids Res 24:17-20