The functionally diverse enolase (EN) superfamily is a paradigm for understanding how homologous enzymes with conserved active site architectures catalyze different reactions [1, 2]. The reactions are initiated by a conserved partial reaction: Mg2+-assisted abstraction of the α-proton of a carboxylate substrate to generate an enediolate intermediate stabilized by coordination to the Mg2+; the intermediate is then directed to product by a reaction-specific acid. The active sites are located at the interface between 1) a (β/α)7β-barrel domain that contains the catalytic groups; and 2) a capping α+β domain that contains most of the substrate specificity determinants. The Mg2+ is coordinated to three conserved ligands (Asp/Glu) at the ends of the third, fourth, and fifth β-strands of the barrel domain and to at least one carboxylate oxygen of the substrate; the active site of mandelate racemase (MR) is shown in the figure. A base at the end of the second, sixth, or seventh β-strand generates the enediolate intermediate. An acid at the end of the second, third, sixth, or seventh β-strand directs the intermediate to product.
We now recognize seven functionally assigned subgroups that differ in their active site motifs, i.e., the identities and locations of the acid/base catalysts and metal ion ligands at the ends of the β-strands. More will be identified as "new" functions are assigned and structures are determined. The superfamily can be grouped into families using the Cytoscape-visualized sequence similarity network (SSN) analysis developed by the SFLD (Superfamily/Genome Core) [3]. Sequences are color-coded by function, with gray marking unknown function. In collaboration with NYSGXRC, structures were determined for 29 members of unknown function to support functional assignment. Most sequences can be associated with two functionally diverse subgroups, designated the muconate lactonizing enzyme (MLE) and mandelate racemase (MR) subgroups.
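The idea behind grouping sequences with a sequence similarity network can be illustrated with a minimal sketch. This is not the SFLD pipeline: the sequence names, the pairwise scores, and the threshold below are all hypothetical, and the score is simply assumed to be a number where higher means more similar. Sequences are nodes, an edge is drawn when a pair meets the threshold, and each connected component is treated as a candidate family.

```python
from collections import defaultdict


def build_ssn(pair_scores, threshold):
    """Build an undirected network: one node per sequence, one edge per
    pair whose (hypothetical) similarity score meets the threshold."""
    adjacency = defaultdict(set)
    for (seq_a, seq_b), score in pair_scores.items():
        # Touch both keys so that unconnected sequences still appear as nodes.
        adjacency[seq_a]
        adjacency[seq_b]
        if score >= threshold:
            adjacency[seq_a].add(seq_b)
            adjacency[seq_b].add(seq_a)
    return adjacency


def clusters(adjacency):
    """Return the connected components of the network as sorted lists."""
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(sorted(component))
    return components


# Hypothetical toy data: names and scores are illustrative only.
toy_scores = {
    ("mle_1", "mle_2"): 80,
    ("mr_1", "mr_2"): 75,
    ("mle_2", "unk_1"): 60,
    ("mle_1", "mr_1"): 20,
}

for family in clusters(build_ssn(toy_scores, threshold=50)):
    print(family)
```

With these toy scores, a threshold of 50 places the unassigned sequence `unk_1` in the same cluster as the two MLE-like sequences, which is the kind of association that would then need experimental confirmation. Lowering the threshold merges clusters, so the choice of threshold controls how finely the network is divided.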
Four functions are known in the MLE subgroup: cycloisomerization (MLE), dehydration (o-succinylbenzoate synthase, OSBS), epimerization (L-Ala-D/L-Glu epimerase, AEE), and racemization (N-succinylamino acid racemase, NSAR). The carboxylate oxygens of the substrate are bidentate ligands of the Mg2+ ion. Lys acid/base catalysts are located at the ends of the second and sixth β-strands of the barrel domain. Only 15% of the members (1258 total members; June 1, 2009) have unknown functions.
Nine functions are known in the MR subgroup: racemization by MR and dehydration by seven families of acid sugar dehydratases in bacterial carbohydrate catabolism (one family is bifunctional). The substrates are α-OH acids, with one carboxylate oxygen and the α-OH providing bidentate ligands for the Mg2+. About 65% of the members (1310 total members) have unknown functions. Many members thus lack assigned functions, and their number grows as additional microbial genomes are sequenced.
With the support of P01 GM071790, we developed and used computational approaches to predict and then experimentally assign 1) the N-succinyl Arg racemase (NSAR) function [4], 2) divergent substrate specificities for dipeptide epimerases [5] (also unpublished), and 3) a divergent galactarate dehydratase (unpublished). In the first two examples, a substrate-liganded AEE (PDB code 1TKK) was used as the template for homology modeling, so functions were predicted from sequence; in the last example, a structure determined by NYSGXRC (PDB code 2OQY) was used for in silico ligand docking.
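To put the membership figures quoted above in absolute terms, the short sketch below converts the stated percentages into approximate counts of members with unknown function. The function name is hypothetical, and because the MR percentage is given only as "about 65%", the results are estimates rather than reported values.

```python
def estimated_unknown(total_members, percent_unknown):
    """Estimate the number of unassigned members, rounded to the nearest
    whole sequence using integer arithmetic."""
    return (total_members * percent_unknown + 50) // 100


# Totals and percentages as quoted in the text (MLE figure dated June 1, 2009).
subgroups = {"MLE": (1258, 15), "MR": (1310, 65)}

counts = {
    name: estimated_unknown(total, percent)
    for name, (total, percent) in subgroups.items()
}

for name, count in counts.items():
    print(f"{name}: roughly {count} members of unknown function")
```

On these figures, the MR subgroup has several times as many unassigned members as the MLE subgroup despite the two subgroups being of similar size.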

National Institutes of Health (NIH)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZGM1)
University of Illinois Urbana-Champaign
United States
Mashiyama, Susan T; Malabanan, M Merced; Akiva, Eyal et al. (2014) Large-scale determination of sequence, structure, and function relationships in cytosolic glutathione transferases across the biosphere. PLoS Biol 12:e1001843
Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E et al. (2014) The Structure-Function Linkage Database. Nucleic Acids Res 42:D521-30
Zheng, Heping; Hou, Jing; Zimmerman, Matthew D et al. (2014) The future of crystallography in drug discovery. Expert Opin Drug Discov 9:125-37
Wichelecki, Daniel J; Graff, Dylan C; Al-Obaidi, Nawar et al. (2014) Identification of the in vivo function of the high-efficiency D-mannonate dehydratase in Caulobacter crescentus NA1000 from the enolase superfamily. Biochemistry 53:4087-9
Dong, Guang Qiang; Calhoun, Sara; Fan, Hao et al. (2014) Prediction of substrates for glutathione transferases by covalent docking. J Chem Inf Model 54:1687-99
Wichelecki, Daniel J; Vendiola, Jean Alyxa Ferolin; Jones, Amy M et al. (2014) Investigating the physiological roles of low-efficiency D-mannonate and D-gluconate dehydratases in the enolase superfamily: pathways for the catabolism of L-gulonate and L-idonate. Biochemistry 53:5692-9
Bouvier, Jason T; Groninger-Poe, Fiona P; Vetting, Matthew et al. (2014) Galactaro δ-lactone isomerase: lactone isomerization by a member of the amidohydrolase superfamily. Biochemistry 53:614-6
Wichelecki, Daniel J; Froese, D Sean; Kopec, Jolanta et al. (2014) Enzymatic and structural characterization of rTSγ provides insights into the function of rTSβ. Biochemistry 53:2732-8
Pandya, Chetanya; Dunaway-Mariano, Debra; Xia, Yu et al. (2014) Structure-guided approach for detecting large domain inserts in protein sequences as illustrated using the haloacid dehalogenase superfamily. Proteins 82:1896-906
Kumar, Ritesh; Zhao, Suwen; Vetting, Matthew W et al. (2014) Prediction and biochemical demonstration of a catabolic pathway for the osmoprotectant proline betaine. MBio 5:e00933-13

Showing the most recent 10 out of 49 publications