The era of biotechnology is spawning a knowledge explosion of new and highly interrelated genomic information. Proteins, the expression of genes, are primarily responsible for the functioning or malfunctioning of the biological organism and, thus, are directly related to health and disease for humans. The broad, long-term objective of the Protein Information Resource (PIR) is to be a significant public domain resource for the scientific community that supports functional and structural genomic research.
The specific aims are (1) to provide comprehensive, timely, accurately annotated, fully classified, extensively cross-referenced, and freely accessible PIR protein sequence and auxiliary databases, together with sequence analysis and family classification tools, (2) to further develop a bioinformatics infrastructure to support PIR database activities and to foster collaboration and communication with the scientific community, and (3) to establish robust two-way communication with the scientific community for accurate PIR annotation and for dissemination and evaluation. The key elements of the research design are: (1) Comprehensive classification of proteins into non-overlapping families and superfamilies, with identification of homology domains and functional motifs, to allow automatic, rapid, and accurate classification and annotation. (2) The PIR knowledge base system, which contains comprehensive protein family, functional, and structural information, as well as linkage to multiple, heterogeneous external databases, in an object-relational database management system with a distributed object computing framework, and which integrates databases and data analysis and mining tools to facilitate protein information, exploration of protein structure and function, and accurate genome annotation. (3) The PIR Expert Scientific Council, a new paradigm for community involvement and participation in the maintenance of a public domain sequence database. Active community communication will allow PIR to obtain expert scientific contributions, acquire user requirements and evaluation, foster scientific cooperation, promote database interoperability, and provide wide system distribution. Successful completion of these aims will allow PIR to become an essential resource for basic scientific knowledge discovery towards the understanding of human health.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Biotechnology Resource Grants (P41)
Project #
2P41LM005798-06A1
Application #
6292612
Study Section
Special Emphasis Panel (ZLM1-SJP-A (J2))
Program Officer
Marron, Michael T
Project Start
1995-03-01
Project End
2002-09-30
Budget Start
2001-03-01
Budget End
2002-09-30
Support Year
6
Fiscal Year
2001
Total Cost
$1,323,805
Indirect Cost
Name
National Biomedical Research Foundation
Department
Type
DUNS #
City
Washington
State
DC
Country
United States
Zip Code
20007
Huang, Hongzhan; Barker, Winona C; Chen, Yongxing et al. (2003) iProClass: an integrated database of protein family, function and structure information. Nucleic Acids Res 31:390-2
Wu, C H; Xiao, C; Hou, Z et al. (2001) iProClass: an integrated, comprehensive and annotated protein classification database. Nucleic Acids Res 29:52-4
Srinivasarao, G Y; Yeh, L S; Marzec, C R et al. (1999) PIR-ALN: a database of protein sequence alignments. Bioinformatics 15:382-90
Srinivasarao, G Y; Yeh, L S; Marzec, C R et al. (1999) Database of protein sequence alignments: PIR-ALN. Nucleic Acids Res 27:284-5
Barker, W C; Garavelli, J S; McGarvey, P B et al. (1999) The PIR-International Protein Sequence Database. Nucleic Acids Res 27:39-43
Barker, W C; Garavelli, J S; Haft, D H et al. (1998) The PIR-International Protein Sequence Database. Nucleic Acids Res 26:27-32
George, D G; Dodson, R J; Garavelli, J S et al. (1997) The Protein Information Resource (PIR) and the PIR-International Protein Sequence Database. Nucleic Acids Res 25:24-8
Barker, W C; Hunt, L T (1997) Analysis and organization of protein sequence data: a retrospective spanning four decades. J Protein Chem 16:459-62
George, D G; Hunt, L T; Barker, W C (1996) PIR-International Protein Sequence Database. Methods Enzymol 266:41-59
Barker, W C; Pfeiffer, F; George, D G (1996) Superfamily classification in PIR-International Protein Sequence Database. Methods Enzymol 266:59-71

Showing the most recent 10 out of 11 publications