The HAD superfamily is a large enzyme family (~19,000 nonredundant sequences) [1] of phosphotransferases (phosphomutases, ATPases and phosphatases) represented in all three kingdoms of life [2-4] and, within each cell, by a large number of homologs (28 in E. coli; 35 in Salmonella typhimurium; 31 in Pseudomonas aeruginosa; 30 in Mycobacterium tuberculosis; 31 in Bacillus cereus; 24 in Bacteroides fragilis; 24 in Streptococcus pneumoniae; 45 in Saccharomyces cerevisiae; 84 in Caenorhabditis elegans; 169 in Arabidopsis thaliana; 292 in Selaginella moellendorffii; 183 in human). As many as 80-90% of the members are phosphatases [5], the vast majority of which have unknown functions. Approximately 40% of the bacterial metabolome consists of phosphorylated metabolites [6]. Phosphate substituents are common because they enhance the water solubility of the metabolite as well as its ability to bind to metabolic enzymes with high affinity and specificity. The removal of phosphate groups from phosphorylated metabolites is performed by phosphatases. The "function" of a particular phosphatase is defined by the phosphorylated metabolite that it targets in the cell, i.e., by its "physiological substrate". Thus, the HAD phosphatases meet the demands of cellular processes and metabolic pathways that involve phosphorylated macromolecules and metabolites. Divergence in HAD phosphatase function is based on the divergence of the substrate-recognition elements. The substrate-recognition elements are separate from the catalytic scaffold, which is located in the core domain (Figure 1A). The four peptide segments or "motifs" that form the active site position the conserved Asp nucleophile, the Asp acid/base, the Lys/Arg and Ser/Thr phosphate-binding residues, and the Asp/Glu residues that bind the Mg²⁺ cofactor (Figure 1B).
These residues, in combination with the scaffold main-chain elements, form a steric and electrostatic mold that stabilizes the trigonal bipyramidal transition states/intermediates produced along the reaction pathway (Figure 1B) [7]. The HAD phosphatase substrate-recognition elements are located either in a cap domain (as in HAD classes C1 and C2, also known as Type I and Type II) tethered to the core domain by a solvated linker, or in short loop/helical segments that extend from the core domain (as in the "capless" HAD class C0, also known as Type III) (Figure 1A) [8]. Although HAD phosphatases possess the same catalytic site and proceed through the same second partial reaction, they are able to use the specific structural requirements of the substrate-binding step and the subsequent addition-elimination steps of the first partial reaction to discriminate between the physiological substrate and other phosphorylated species (macromolecules and metabolites). The induced fit model, wherein substrate binding is followed by cap domain or loop closure, applies to most HAD phosphatases. Favorable electrostatic interaction between the substrate leaving group and the cap domain/gating loops will contribute to the substrate-binding affinity. For efficient turnover, the phosphoryl group must be bound in the correct orientation within the catalytic site. If the substrate leaving group is too large or too small, nonproductive binding is likely to occur. Thus, the size, shape and electrostatic surface of the active site region that extends from the catalytic site to the active site entrance can provide significant insight into the identity of the physiological substrate. This serves as the basis for the use of virtual screening (made possible by the Structure Core and Computation Core) to identify candidates for the physiological substrate herein.
Substrate specificities defined by experimental activity screens suggest that the typical HAD phosphatase has loose substrate specificity coupled with modest catalytic efficiency. Thus, activity screens alone often cannot identify the actual physiological substrate. Rather, they provide candidates that can be further interrogated using the tools provided by the Sequence/Genome Analysis Core and Microbiology Core.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Specialized Center--Cooperative Agreements (U54)
Study Section
Special Emphasis Panel (ZGM1-PPBC-3)
University of Illinois Urbana-Champaign
United States