Structural genomics efforts aim to ultimately provide an experimental structure or good theoretical model of every tractable protein encoded by all sequenced genomes. Major efforts in this direction are now beginning, and the PRESAGE database aids coordination among groups and dissemination of results. PRESAGE records experimental structure determination underway (experimental annotations) and structural predictions and models (prediction annotations). As such, it provides a mechanism for coordination among different researchers without requiring centralization. It also aids dissemination of both experimental and computational structural genomics to a broad audience of biologists. PRESAGE was motivated by the need for scientific communication. While structural biologists have historically often been reluctant to discuss projects underway, this attitude can be disastrous when applied to large-scale projects. Already, there has been a duplication (and almost triplication) of effort in studying one protein, and another protein's structure has been solved because it was not apparent that its structure had already been accurately predicted. Early pre-releases of PRESAGE have shown that the American structural genomics community has been surprisingly receptive to sharing information about their experimental targets and results. Major international groups have also recently committed to submit targets to the system. We propose to reengineer PRESAGE from a "proof-of-concept" prototype to a robust and reliable system. Most significantly, this will involve rewriting the whole database access system, which comprises most of PRESAGE except the user interface. In addition, several specific new services are planned, including customized systems for data collection (designed in collaboration with structural genomics researchers); family neighboring facilities; broader data collection; and flexible query systems with parseable output.
We hope that PRESAGE will thus grow as an international resource both for producers of structural genomics data and for all biologists who can use these data on genomics and protein structure to aid their research.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
1R01GM062621-01
Application #
6262639
Study Section
Biophysical Chemistry Study Section (BBCB)
Program Officer
Edmonds, Charles G
Project Start
2001-06-15
Project End
2003-05-31
Budget Start
2001-06-15
Budget End
2002-05-31
Support Year
1
Fiscal Year
2001
Total Cost
$112,800
Indirect Cost
Name
University of California Berkeley
Department
Other Basic Sciences
Type
Schools of Earth Sciences/Natur
DUNS #
094878337
City
Berkeley
State
CA
Country
United States
Zip Code
94704
Smith, Andrew; Chandonia, John-Marc; Brenner, Steven E (2006) ANDY: a general, fault-tolerant tool for database searching on computer clusters. Bioinformatics 22:618-20