This research plan describes the computational aspects of a strategy for predicting the substrate specificities of unknown enzymes from the genome projects in order to direct and facilitate experimental assignment of their functions. Anchored by functional predictions that are validated by the experimental projects, high-quality functional annotations can then be made for many additional sequences by annotation transfer. Focusing on non-trivial problems in function prediction, we have integrated our expertise in bioinformatics, in silico docking, and comparative structural modeling to achieve substantial success, contributing to the discovery of 32 new functions in the large and functionally diverse enolase and amidohydrolase (AH) superfamilies and to the annotation of hundreds of orthologous sequences by annotation transfer. In close collaboration with the experimental investigators, we will continue to develop an iterative cycle in which multiple parallel and serial paths are integrated to obtain high-quality information useful for functional prediction.
In the next funding period, we aim to build on breakthroughs in docking against both experimentally determined and modeled structures, in particular to predict the functions of proteins in the metabolic pathways in which these superfamily members (and those of a new target superfamily, the RuBisCO-like proteins) reside, thereby extending our efforts toward a more general solution for prediction of functional specificity. Proteins encoded in the same operons as our target superfamily members are expected to catalyze reactions in the pathway that can be linked to the fundamental chemical capabilities of those members, providing clues to metabolic context. Similarly, we can expect substrates for enzymes in the pathway to contain substructures related to those of the substrates of our target superfamilies, providing additional clues for filtering docking results against these proteins. To take advantage of these similarities, new methods developed by our groups for comparison of ligand structures and substructures will be applied to docking hit lists to identify patterns across multiple proteins of an operon that are useful for restricting the set of potential substrates for further evaluation and experimental testing. To the extent we succeed, this effort will lay the groundwork for generalization of our approaches for the discovery of new enzyme functions, new pathways, and new biology.

Public Health Relevance

Accurate prediction of molecular function for sequences in the genome projects is required to identify mechanisms of disease and improve drug discovery and development. In collaboration with the experimental projects, continuation of this computational project will contribute to this goal by applying orthogonal methods to correctly predict molecular function in large enzyme superfamilies and to extend those predictions on a large scale to determine the biological function of associated metabolic pathways.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Program Projects (P01)
Study Section
Special Emphasis Panel (ZRG1-BCMB-D)
University of Illinois Urbana-Champaign
United States
Hobbs, Merlin Eric; Williams, Howard J; Hillerich, Brandan et al. (2014) l-Galactose metabolism in Bacteroides vulgatus from the human gut microbiota. Biochemistry 53:4661-70
Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E et al. (2014) The Structure-Function Linkage Database. Nucleic Acids Res 42:D521-30
Wichelecki, Daniel J; Graff, Dylan C; Al-Obaidi, Nawar et al. (2014) Identification of the in vivo function of the high-efficiency D-mannonate dehydratase in Caulobacter crescentus NA1000 from the enolase superfamily. Biochemistry 53:4087-9
Xiang, Dao Feng; Kumaran, Desigan; Swaminathan, Subramanyam et al. (2014) Structural characterization and function determination of a nonspecific carboxylate esterase from the amidohydrolase superfamily with a promiscuous ability to hydrolyze methylphosphonate esters. Biochemistry 53:3476-85
Wichelecki, Daniel J; Vendiola, Jean Alyxa Ferolin; Jones, Amy M et al. (2014) Investigating the physiological roles of low-efficiency D-mannonate and D-gluconate dehydratases in the enolase superfamily: pathways for the catabolism of L-gulonate and L-idonate. Biochemistry 53:5692-9
Bouvier, Jason T; Groninger-Poe, Fiona P; Vetting, Matthew et al. (2014) Galactaro δ-lactone isomerase: lactone isomerization by a member of the amidohydrolase superfamily. Biochemistry 53:614-6
Ghasempur, Salehe; Eswaramoorthy, Subramaniam; Hillerich, Brandan S et al. (2014) Discovery of a novel L-lyxonate degradation pathway in Pseudomonas aeruginosa PAO1. Biochemistry 53:3357-66
Groninger-Poe, Fiona P; Bouvier, Jason T; Vetting, Matthew W et al. (2014) Evolution of enzymatic activities in the enolase superfamily: galactarate dehydratase III from Agrobacterium tumefaciens C58. Biochemistry 53:4192-203
Cummings, Jennifer A; Vetting, Matthew; Ghodge, Swapnil V et al. (2014) Prospecting for unannotated enzymes: discovery of a 3',5'-nucleotide bisphosphate phosphatase within the amidohydrolase superfamily. Biochemistry 53:591-600
Wichelecki, Daniel J; Balthazor, Bryan M; Chau, Anthony C et al. (2014) Discovery of function in the enolase superfamily: D-mannonate and d-gluconate dehydratases in the D-mannonate dehydratase subgroup. Biochemistry 53:2722-31

Showing the most recent 10 out of 105 publications