The functionally diverse enolase (EN) superfamily is a paradigm for understanding how homologous enzymes with conserved active site architectures catalyze different reactions [1, 2]. The reactions are initiated by a conserved partial reaction: Mg2+-assisted abstraction of the α-proton of a carboxylate substrate to generate an enediolate intermediate stabilized by coordination to the Mg2+; the intermediate is directed to product by a reaction-specific acid. The active sites are located at the interface between 1) a (β/α)7β-barrel domain that contains the catalytic groups; and 2) a capping α+β domain that contains most of the substrate specificity determinants. The Mg2+ is coordinated to three conserved ligands (Asp/Glu) at the ends of the third, fourth, and fifth β-strands of the barrel domain and to at least one carboxylate oxygen of the substrate; the active site of mandelate racemase (MR) is shown in the figure. A base at the end of the second, sixth, or seventh β-strand generates the enediolate intermediate that is stabilized by coordination to Mg2+. An acid at the end of the second, third, sixth, or seventh β-strand directs the intermediate to product.
We now recognize seven functionally assigned subgroups that have different active site motifs, i.e., identities and locations of acid/base catalysts and metal ion ligands at the ends of the β-strands. More will be identified as "new" functions are assigned and structures are determined. The superfamily can be grouped into families with the Cytoscape-visualized sequence similarity network (SSN) analysis developed by the SFLD (Superfamily/Genome Core) [3]. Sequences are color-coded by function, with gray marking unknown function. In collaboration with NYSGXRC, structures were determined for 29 members of unknown function to support assignment of function. Most sequences can be associated with two functionally diverse subgroups, designated the muconate lactonizing enzyme (MLE) and mandelate racemase (MR) subgroups.
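As a rough illustration of how a sequence similarity network groups sequences into families, the sketch below keeps an edge between two sequences only when their pairwise score meets a threshold, then reports the connected components as clusters. It is a minimal sketch, not the SFLD or Cytoscape workflow: the function names (`build_ssn`, `clusters`), the sequence labels, the scores, and the thresholds are all hypothetical values chosen for illustration.

```python
def build_ssn(pairwise_scores, threshold):
    """Build an undirected graph keeping only edges with score >= threshold.

    pairwise_scores maps (seq_a, seq_b) tuples to a similarity score
    (hypothetical here; in practice often derived from alignment E-values).
    """
    graph = {}
    for (a, b), score in pairwise_scores.items():
        # Register both nodes so sequences with no retained edges still appear.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def clusters(graph):
    """Return connected components as sorted lists, in sorted node order."""
    seen = set()
    result = []
    for node in sorted(graph):
        if node in seen:
            continue
        component = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(graph[current] - seen)
        result.append(sorted(component))
    return result


# Hypothetical labels and scores, for illustration only.
scores = {
    ("MLE_1", "OSBS_1"): 45,
    ("MLE_1", "MR_1"): 12,
    ("MR_1", "unk_1"): 60,
    ("OSBS_1", "unk_1"): 8,
}

# A stringent threshold separates the sequences into two clusters;
# a permissive one merges them into a single cluster.
stringent = clusters(build_ssn(scores, threshold=30))
permissive = clusters(build_ssn(scores, threshold=10))
```

Raising the threshold splits the network into smaller, more closely related clusters, and a sequence of unknown function (here `unk_1`) that clusters with an assigned sequence becomes a candidate for a related function.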
Four functions are known in the MLE subgroup: cycloisomerization (MLE), dehydration (o-succinylbenzoate synthase, OSBS), epimerization (L-Ala-D/L-Glu epimerase, AEE), and racemization (N-succinylamino acid racemase, NSAR). The carboxylate oxygens of the substrate are bidentate ligands of the Mg2+ ion. Lys acid/base catalysts are located at the ends of the second and sixth β-strands of the barrel domain. Only 15% of the members (1258 total members; June 1, 2009) have unknown functions.
Nine functions are known in the MR subgroup: racemization by MR and dehydration by seven families of acid sugar dehydratases in bacterial carbohydrate catabolism (one bifunctional family). The substrates are α-OH acids, with one carboxylate oxygen and the α-OH providing bidentate ligands for the Mg2+. About 65% of the members (1310 total members) have unknown functions. The number of members with unknown functions continues to expand as additional microbial genomes are sequenced.
With the support of P01 GM071790, we developed and used computational approaches to predict and then experimentally assign 1) the N-succinyl Arg racemase (NSAR) function [4], 2) divergent substrate specificities for dipeptide epimerases [5] (also unpublished), and 3) a divergent galactarate dehydratase (unpublished). In the first two examples, a substrate-liganded AEE (PDB code 1TKK) was used as the template for homology modeling, so functions were predicted from sequence; in the last example, a structure determined by NYSGXRC (PDB code 2OQY) was used for in silico ligand docking.
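The percentages quoted above for the MLE and MR subgroups can be turned into approximate member counts. The short calculation below is only an estimate, since the source percentages are themselves rounded ("about 65%"); the helper name `estimate_unknown` is hypothetical.

```python
def estimate_unknown(total_members, percent_unknown):
    """Estimate members of unknown function, rounding half up.

    Integer arithmetic avoids floating-point surprises at .5 boundaries.
    """
    return (total_members * percent_unknown + 50) // 100


# Totals and percentages as stated in the text.
mle_unknown = estimate_unknown(1258, 15)  # MLE subgroup
mr_unknown = estimate_unknown(1310, 65)   # MR subgroup

print(f"MLE subgroup: ~{mle_unknown} of 1258 members with unknown function")
print(f"MR subgroup:  ~{mr_unknown} of 1310 members with unknown function")
```

On these figures the MR subgroup holds roughly four to five times as many unassigned members as the MLE subgroup, despite the two subgroups being of similar size.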

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZGM1-PPBC-3)
University of Illinois Urbana-Champaign
United States