Recent developments in automated techniques for DNA sequencing have led to an explosion of information on the complete genome sequences of several organisms. Entire genomic sequences of 11 microorganisms are available now, and soon the genomes of almost three dozen additional organisms will be completed. These revolutionary data have stimulated a mosaic of research from the basic science, medical, and biotechnology communities, focused on determining the essential complement of genetic information and functional attributes that an organism needs to sustain life. A striking observation made as each organism's genome is analyzed is that almost one third of the putative open reading frames, although conserved among several organisms, encode hypothetical proteins of no known function. The unknown physiological functions of these protein products represent a major gap in our understanding of the full complement of genetic information needed for the viability of any free-living organism. The overall goal of this research program is to elucidate the function of roughly 50 proteins from a set of 65 bona fide hypothetical proteins from Haemophilus influenzae, an organism of moderate genome size, by determining their high-resolution atomic structures. The first step is to develop high-throughput methodology for subcloning the open reading frames that have already been screened for expressible polypeptides into high-level expression vectors, to optimize the yields of soluble protein products. The second phase is to develop efficient protocols for high-yield purification of native proteins of suitable quality and in sufficient quantities to begin large-scale crystallization studies. The final component is to characterize the targeted proteins in terms of their quaternary structure, and also in terms of their solubility and stability in solution.
These data will permit protein targets to be identified for structure determination by NMR methods, provide clues for the crystallization of challenging proteins, and yield data on the physical and chemical properties of the hypothetical proteins that will be useful for functional determinations.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Program Projects (P01)
University of MD Biotechnology Institute
United States
Zhao, Hong; Lim, Kap; Choudry, Anthony et al. (2012) Correlation of structure and function in the human hotdog-fold enzyme hTHEM4. Biochemistry 51:6490-2
Chen, Chen; Gorlatova, Natalia; Kelman, Zvi et al. (2011) Structures of p63 DNA binding domain in complexes with half-site and with spacer-containing full response elements. Proc Natl Acad Sci U S A 108:6456-61
Lim, Kap; Pullalarevu, Sadhana; Surabian, Karen Talin et al. (2010) Structural basis for the mechanism and substrate specificity of glycocyamine kinase, a phosphagen kinase family member. Biochemistry 49:2031-41
Chen, Chen; Sun, Qihong; Narayanan, Buvaneswari et al. (2010) Structure of oxalacetate acetylhydrolase, a virulence factor of the chestnut blight fungus. J Biol Chem 285:26685-96
Melamud, Eugene; Moult, John (2009) Stochastic noise in splicing machinery. Nucleic Acids Res 37:4873-86
Melamud, Eugene; Moult, John (2009) Structural implication of splicing stochastics. Nucleic Acids Res 37:4862-72
Willis, Mark A; Zhuang, Zhihao; Song, Feng et al. (2008) Structure of YciA from Haemophilus influenzae (HI0827), a hexameric broad specificity acyl-coenzyme A thioesterase. Biochemistry 47:2797-805
Chao, Kinlin L; Lim, Kap; Lehmann, Christopher et al. (2008) The Escherichia coli YdcF binds S-adenosyl-L-methionine and adopts an alpha/beta-fold characteristic of nucleotide-utilizing enzymes. Proteins 72:506-9
Zhuang, Zhihao; Song, Feng; Zhao, Hong et al. (2008) Divergence of function in the hot dog fold enzyme superfamily: the bacterial thioesterase YciA. Biochemistry 47:2789-96
Sari, Nese; He, Yanan; Doseeva, Victoria et al. (2007) Solution structure of HI1506, a novel two-domain protein from Haemophilus influenzae. Protein Sci 16:977-82

Showing the most recent 10 out of 52 publications