The ACT Database is a database of structure-function relationships for proteins of diverse function. ACT describes, in detail and in machine-readable form, the functions and molecular interactions of a large subset of proteins (currently 680) selected from the Protein Data Bank (PDB). ACT functional information is cross-referenced to 3-D structural information in the PDB. For each protein, it includes the type of chemical interaction(s), the location of the functional site(s) on the structure, and the contributions of individual site residues to the interactions. Consequently, each ACT entry contains much more detail than entries in other databases with functional information (such as ProSite, Swissprot, and Enzyme), which annotate sequences, not structures. ACT also describes aspects of functional site structure that cannot be deduced from sequence-oriented databases, such as how functional sites are shared across subunits (spanning subunit interfaces) and/or exist in multiple copies in homo-oligomers, and whether a substrate (or an analog or molecular mimic of the substrate) is present in the structure of a complex. Because all fields in ACT are readable by computer programs, the database will have many applications in the era of structural genomics. These include: proteome-scale bioinformatic analysis of the mechanisms of molecular recognition; new methods to locate functional sites in proteins where the site is unknown (which will be increasingly common in the coming era); prediction of the type of interaction for proteins of unknown function; and the integration of structure prediction with function prediction. Our goals in this project include: (1) roughly doubling the size of the database, at which point all structures currently in the PDB will be homologous to at least one ACT entry at a level of 25% sequence identity or better, which will by implication greatly expand the machine-readable functional information available for the known proteome; (2) developing user-friendly Web query forms so that the complex information in ACT can be easily accessed by general users; (3) converting ACT to a relational database (RDB) so that it can be queried by advanced users who know SQL; (4) converting ACT to XML format to facilitate exchange of the database with other institutions; and (5) linking ACT more closely to other databases, such as the Database of Interacting Proteins (DIP), based here at UCLA.
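
To make goals 3 and 4 concrete, the following is a minimal sketch of how a per-protein entry (interaction type, functional site, contributing residues) might be stored relationally and exported as XML. It is an illustration only: the table names, column names, XML element names, and the placeholder record "XXXX" are all assumptions and do not reflect the actual ACT schema or data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: names are illustrative, not ACT's real layout.
SCHEMA = """
CREATE TABLE protein (
    pdb_id TEXT PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE functional_site (
    site_id          INTEGER PRIMARY KEY,
    pdb_id           TEXT NOT NULL REFERENCES protein(pdb_id),
    interaction_type TEXT NOT NULL,
    spans_interface  INTEGER NOT NULL  -- 1 if the site crosses a subunit interface
);
CREATE TABLE site_residue (
    site_id  INTEGER NOT NULL REFERENCES functional_site(site_id),
    chain    TEXT NOT NULL,
    res_num  INTEGER NOT NULL,
    res_name TEXT NOT NULL,
    role     TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding one placeholder (not real) entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO protein VALUES (?, ?)", ("XXXX", "example enzyme"))
    conn.execute(
        "INSERT INTO functional_site VALUES (?, ?, ?, ?)",
        (1, "XXXX", "catalysis", 1),
    )
    conn.executemany(
        "INSERT INTO site_residue VALUES (?, ?, ?, ?, ?)",
        [
            (1, "A", 57, "HIS", "placeholder role 1"),
            (1, "B", 102, "ASP", "placeholder role 2"),
        ],
    )
    conn.commit()
    return conn


def sites_spanning_interfaces(conn):
    """Return (pdb_id, interaction_type, chain_count) for interface-spanning sites."""
    rows = conn.execute(
        "SELECT p.pdb_id, s.interaction_type, COUNT(DISTINCT r.chain) "
        "FROM protein p "
        "JOIN functional_site s ON s.pdb_id = p.pdb_id "
        "JOIN site_residue r ON r.site_id = s.site_id "
        "WHERE s.spans_interface = 1 "
        "GROUP BY s.site_id, p.pdb_id, s.interaction_type "
        "ORDER BY s.site_id"
    )
    return rows.fetchall()


def entry_to_xml(conn, pdb_id):
    """Serialize one entry's sites and residues as an XML string."""
    root = ET.Element("entry", pdb_id=pdb_id)
    sites = conn.execute(
        "SELECT site_id, interaction_type FROM functional_site "
        "WHERE pdb_id = ? ORDER BY site_id",
        (pdb_id,),
    ).fetchall()
    for site_id, interaction in sites:
        site = ET.SubElement(root, "site", interaction=interaction)
        residues = conn.execute(
            "SELECT chain, res_num, res_name, role FROM site_residue "
            "WHERE site_id = ? ORDER BY chain, res_num",
            (site_id,),
        )
        for chain, num, name, role in residues:
            ET.SubElement(
                site, "residue",
                chain=chain, number=str(num), name=name, role=role or "",
            )
    return ET.tostring(root, encoding="unicode")
```

Under these assumptions, an SQL user could call `sites_spanning_interfaces(build_demo_db())` to find sites shared across subunits, while `entry_to_xml` produces an exchangeable record for the same entry. Keeping residues in their own table lets a single site reference residues from more than one chain.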

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM007878-03
Application #
6952817
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Florance, Valerie
Project Start
2003-09-30
Project End
2006-09-29
Budget Start
2005-09-30
Budget End
2006-09-29
Support Year
3
Fiscal Year
2005
Total Cost
$208,575
Indirect Cost
Name
University of California Los Angeles
Department
Chemistry
Type
Schools of Arts and Sciences
DUNS #
092530369
City
Los Angeles
State
CA
Country
United States
Zip Code
90095
Pettit, Frank K; Bare, Emiko; Tsai, Albert et al. (2007) HotPatch: a statistical approach to finding biologically relevant features on protein surfaces. J Mol Biol 369:863-79