Certain key cellular constituents, which we term "common proteins", are highly conserved across the major divisions of life and belong to vast protein classes that have functionally diverged into many hierarchically arranged subgroups. Thus each of these proteins manifests variations on recurring structural and/or mechanistic themes. These classes include, for example, AAA+ ATPases, P-loop GTPases and protein kinases, which were the focus of our preliminary studies because of their biomedical significance. AAA+ ATPases are associated with hereditary spastic paraplegia and with the neurologic disorders torsin dystonia, Zellweger syndrome, neonatal adrenoleukodystrophy, and infantile Refsum disease. The GTPase Ras plays a key role in cancer, and protein kinases are important cancer and diabetes drug targets.
A major goal of structural biology is to understand protein mechanisms in atomic detail within the context of the living cell. Despite remarkable progress in determining protein structures, however, many aspects of the underlying mechanisms remain unclear. Determining these mechanisms is a daunting task that will require many carefully chosen hypotheses and experiments. Such hypotheses are not formulated in a conceptual vacuum, but rather are based on clues obtained from preliminary observations. The functional constraints imposed on proteins during evolution are a potential source of such clues, inasmuch as they arise from, and thus reflect, the underlying mechanisms. Moreover, because natural selection imposes these constraints on the genomic sequences of living organisms within their native environments, such information lacks the artifactual biases sometimes associated with in vitro experimental systems or with in vivo cell cultures, and it may reveal functionally critical features that have been overlooked because of the inherent limitations of current experimental methods.
Thus the broad, long-range goal of this project is to characterize the functional constraints imposed on common proteins and thereby to provide clues to their underlying mechanisms as an aid to experimental design. Over the past decade we have developed and applied statistically rigorous procedures for characterizing complex patterns of sequence conservation across and within subgroups of related proteins. Using these and other approaches, this project will accomplish the following specific aims: (i) detect and very accurately align as many sequences as possible from several common protein classes; (ii) identify, categorize and quantify the functional constraints acting on these proteins through statistical analysis of the alignments; (iii) identify structural and chemical features associated with these constraints; (iv) similarly analyze proteins that functionally interact with these proteins; and (v) propose molecular mechanisms based on these analyses.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM078541-06
Application #: 7683169
Study Section: Special Emphasis Panel (ZRG1-BCMB-Q (02))
Program Officer: Lyster, Peter
Project Start: 2006-09-01
Project End: 2011-08-31
Budget Start: 2009-09-01
Budget End: 2011-08-31
Support Year: 6
Fiscal Year: 2009
Total Cost: $254,882
Indirect Cost:

Name: University of Maryland Baltimore
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 188435911
City: Baltimore
State: MD
Country: United States
Zip Code: 21201

Neuwald, Andrew F; Lanczycki, Christopher J; Marchler-Bauer, Aron (2012) Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures. BMC Bioinformatics 13:144
Neuwald, Andrew F (2011) Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Stat Appl Genet Mol Biol 10:Article 36
Neuwald, Andrew F (2010) Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. Biol Direct 5:66
Iskow, Rebecca C; McCabe, Michael T; Mills, Ryan E et al. (2010) Natural mutagenesis of human genomes by endogenous retrotransposons. Cell 141:1253-61
Neuwald, Andrew F (2009) Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics 25:1869-75
Ammerman, Nicole C; Gillespie, Joseph J; Neuwald, Andrew F et al. (2009) A typhus group-specific protease defies reductive evolution in rickettsiae. J Bacteriol 191:7609-13
Neuwald, Andrew F (2009) The charge-dipole pocket: a defining feature of signaling pathway GTPase on/off switches. J Mol Biol 390:142-53
Neuwald, Andrew F (2009) The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol 9:11
Kannan, Natarajan; Neuwald, Andrew F; Taylor, Susan S (2008) Analogous regulatory sites within the alphaC-beta4 loop regions of ZAP-70 tyrosine kinase and AGC kinases. Biochim Biophys Acta 1784:27-32
Kannan, Natarajan; Haste, Nina; Taylor, Susan S et al. (2007) The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module. Proc Natl Acad Sci U S A 104:1272-7
