This research plan describes the computational aspects of a strategy for predicting the substrate specificities of unknown enzymes from the genome projects in order to direct and facilitate experimental assignment of their functions. Anchored by functional predictions that are validated by the experimental projects, high-quality functional annotations can then be made for many additional sequences by annotation transfer. Focusing on non-trivial problems in function prediction, we have integrated our expertise in bioinformatics, in silico docking, and comparative structural modeling to achieve substantial success, contributing to the discovery of 32 new functions in the large and functionally diverse enolase and amidohydrolase (AH) superfamilies, and to the annotation of hundreds of orthologous sequences by annotation transfer. In close collaboration with the experimental investigators, we will continue to develop an iterative cycle in which multiple parallel and serial paths are integrated to obtain high-quality information useful for functional prediction.
In the next funding period, we aim to build on breakthroughs in docking against both experimentally determined and modeled structures, in particular to predict the functions of proteins in the metabolic pathways in which these superfamily members (and those of a new target superfamily, the RuBisCO-like proteins) reside, thereby extending our efforts toward a more general solution for prediction of functional specificity. Proteins encoded in the operons that contain our target superfamily members are expected to catalyze pathway reactions that can be linked to the fundamental chemical capabilities of those targets, providing clues to metabolic context. Similarly, we can expect substrates for enzymes in the pathway to contain substructures related to those of the substrates of our target superfamilies, providing additional clues for filtering docking results against these proteins. To take advantage of these similarities, new methods developed by our groups for comparison of ligand structures and substructures will be applied to docking hit lists to identify patterns across multiple proteins of an operon that are useful for restricting potential substrates for further evaluation and experimental testing. To the extent we succeed, this effort will lay the groundwork for generalization of our approaches for the discovery of new enzyme functions, new pathways, and new biology.

Public Health Relevance

Accurate prediction of molecular function for sequences in the genome projects is required to identify mechanisms of disease and improve drug discovery and development. In collaboration with the experimental projects, continuation of this computational project will contribute to this goal by applying orthogonal methods to correctly predict molecular function in large enzyme superfamilies and to extend those predictions on a large scale to determine the biological function of associated metabolic pathways.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZRG1)
University of Illinois Urbana-Champaign
United States
Holliday, Gemma L; Brown, Shoshana D; Akiva, Eyal et al. (2017) Biocuration in the structure-function linkage database: the anatomy of a superfamily. Database (Oxford) 2017:
Webb, Benjamin; Sali, Andrej (2016) Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 54:5.6.1-5.6.37
Vladimirova, Anna; Patskovsky, Yury; Fedorov, Alexander A et al. (2016) Substrate Distortion and the Catalytic Reaction Mechanism of 5-Carboxyvanillate Decarboxylase. J Am Chem Soc 138:826-36
Fedorov, Alexander A; Martí-Arbona, Ricardo; Nemmara, Venkatesh V et al. (2015) Structure of N-formimino-L-glutamate iminohydrolase from Pseudomonas aeruginosa. Biochemistry 54:890-7
Xiang, Dao Feng; Patskovsky, Yury; Nemmara, Venkatesh V et al. (2015) Function discovery and structural characterization of a methylphosphonate esterase. Biochemistry 54:2919-30
Zhang, Xinshuai; Kumar, Ritesh; Vetting, Matthew W et al. (2015) A unique cis-3-hydroxy-l-proline dehydratase in the enolase superfamily. J Am Chem Soc 137:1388-91
Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E et al. (2014) The Structure-Function Linkage Database. Nucleic Acids Res 42:D521-30
Korczynska, Magdalena; Xiang, Dao Feng; Zhang, Zhening et al. (2014) Functional annotation and structural characterization of a novel lactonase hydrolyzing D-xylono-1,4-lactone-5-phosphate and L-arabino-1,4-lactone-5-phosphate. Biochemistry 53:4727-38
Brown, Shoshana; Babbitt, Patricia (2014) Using the structure-function linkage database to characterize functional domains in enzymes. Curr Protoc Bioinformatics 48:2.10.1-16

Showing the most recent 10 out of 120 publications