This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator.
In a prior core research project we developed analyses that identify the residues in biological macromolecules that confer specificity of interaction on the members of a paralogous family of molecules that carry out the same function with or upon different and distinct partners or substrates. The analysis assigns sequence residues to one of three categories:
- Category 1 consists of highly conserved sequence residues essential to the structure and activity of the entire homologous family of macromolecules.
- Category 2 consists of sequence residues that may vary freely.
- Category 3 consists of highly circumscribed sequence residues that maintain the specificity of the activity within the paralogous subfamilies.
The analysis uses two different Kullback-Leibler distances, a family distance and a group distance, between multinomial distributions of residues in specific positions to assign alignment positions to the three categories. Both distances are information-theoretic measures of relative entropy. The family Kullback-Leibler distance contrasts the distribution of residues in an entire column of the alignment with the expected distribution of a column from a random alignment of random sequences. The distribution of residues in a large sequence database such as NBRF-PIR provides this reference random distribution. The group Kullback-Leibler distance contrasts the distribution of residues in a single column within the specified subfamily of the alignment with the distribution of residues in the same column of the rest of the alignment.
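As a minimal sketch of the two contrasts just described, the following Python tabulates the pair of distributions behind the family distance and the pair behind the group distance for one alignment column. The function names, the pseudocount, and the idea of passing the database background in as a dictionary are assumptions made for illustration; they are not taken from the original analysis, and real background frequencies (for example from NBRF-PIR) would have to be supplied by the user.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(residues, pseudocount=1.0):
    """Return the fraction of each amino acid type among `residues`.

    The pseudocount is an assumption of this sketch: it keeps every
    fraction nonzero so that later log-ratios stay finite.
    Gap characters and other symbols are ignored.
    """
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def contrasted_distributions(alignment, labels, column, subfamily, background):
    """Build the two pairs of distributions for one alignment column.

    family pair: whole column vs. the background (database) distribution
    group pair:  column within `subfamily` vs. the same column in the rest
    """
    whole = [seq[column] for seq in alignment]
    inside = [s[column] for s, lab in zip(alignment, labels) if lab == subfamily]
    outside = [s[column] for s, lab in zip(alignment, labels) if lab != subfamily]
    family_pair = (column_distribution(whole), background)
    group_pair = (column_distribution(inside), column_distribution(outside))
    return family_pair, group_pair


# Toy alignment (hypothetical): column 1 separates subfamily "x" from "y".
alignment = ["AD", "AD", "AN", "AN"]
labels = ["x", "x", "y", "y"]
uniform = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}  # placeholder background
family_pair, group_pair = contrasted_distributions(alignment, labels, 1, "x", uniform)
```

In this toy case the subfamily distribution is enriched in Asp relative to the rest of the column, which is the kind of contrast the group distance is meant to pick up.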
The Kullback-Leibler distance is given by the formula $\sum_i (p_i - q_i) \log_2(p_i / q_i)$, where $p_i$ and $q_i$ are the fractions of the residue of type $i$ in the two distributions being contrasted. Category 1 residues have a high family Kullback-Leibler distance and a low group Kullback-Leibler distance. Category 3 residues have a low family Kullback-Leibler distance and a high group Kullback-Leibler distance. Category 2 residues are those where both the family and group Kullback-Leibler distances are low. The fourth possible category (category 4), alignment positions where both the family and group Kullback-Leibler distances are high, is expected to be very rare and, in practice, is rarely observed. We generally define a high Kullback-Leibler distance to be at least 3 standard deviations above the mean of all distances computed for that alignment.
The initial analysis was performed on a group of 145 diverse aldehyde dehydrogenase (ALDH) sequences from at least 13 groups (Perozich et al., 1999), and the analysis is currently being applied to an alignment of 551 diverse sequences. Dr. John Hempel, University of Pittsburgh, has tested several predictions by site-directed mutagenesis experiments. Our analysis identified Asp 247 as a category 3 residue critical to the activity of the Class 3 subfamily of ALDHs; mutation of this residue was previously known to lead to Sjogren-Larsson syndrome in humans, a rare genetic neurocutaneous disorder. Comparative analysis of crystal structures of different subfamilies of ALDH indicates that in the Class 3 ALDH Asp 247 is important in maintaining the relative positions of the catalytic thiol and the residues that bind the NAD cofactor, as well as playing a role in positioning these residues relative to the substrate binding domain. Converting Asp 247 to Asn has a minimal effect on substrate binding but reduces Vmax by two orders of magnitude.
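The distance formula and the category rules stated above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the "high" cutoff is computed separately for the family and the group distances, and the sample standard deviation is used. The text does not specify either choice.

```python
import math
from statistics import mean, stdev


def kl_distance(p, q):
    """Symmetric distance from the text: sum_i (p_i - q_i) * log2(p_i / q_i).

    `p` and `q` map residue types to fractions; every fraction must be > 0.
    """
    return sum((p[a] - q[a]) * math.log2(p[a] / q[a]) for a in p)


def high_cutoff(distances, n_sd=3.0):
    """Cutoff for 'high': mean plus n_sd sample standard deviations."""
    return mean(distances) + n_sd * stdev(distances)


def categorize(family_distances, group_distances, n_sd=3.0):
    """Assign each alignment position to category 1, 2, 3, or 4.

    Assumption of this sketch: the cutoff is computed separately for the
    family distances and for the group distances of the alignment.
    """
    fam_cut = high_cutoff(family_distances, n_sd)
    grp_cut = high_cutoff(group_distances, n_sd)
    categories = []
    for fam, grp in zip(family_distances, group_distances):
        fam_high, grp_high = fam >= fam_cut, grp >= grp_cut
        if fam_high and not grp_high:
            categories.append(1)  # conserved across the whole family
        elif grp_high and not fam_high:
            categories.append(3)  # specific to the subfamily
        elif not fam_high and not grp_high:
            categories.append(2)  # free to vary
        else:
            categories.append(4)  # both high: expected to be rare
    return categories


# Toy distances (hypothetical) for a 20-position alignment.
family = [0.1] * 20
group = [0.1] * 20
family[0] = 5.0  # one strongly conserved position
group[1] = 5.0   # one subfamily-specific position
cats = categorize(family, group)
```

Because the cutoff is defined relative to the spread of all distances in the alignment, only a small number of positions can exceed it, which fits the description of the method as yielding a small, focused set of predictions.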
Collaborator John Hempel is describing this analysis and biochemical validation of its predictions at the 11th International Symposium on the Enzymology and Molecular Biology of Carbonyl Metabolism in Ystad, Sweden this summer.
This analysis has also been applied to 246 Glutathione S-transferases (GSTs), a family of enzymes that catalyze the addition of the tripeptide glutathione to endogenous and xenobiotic electrophilic substrates as part of the detoxication and metabolism of these compounds. The glutathione adducts produced have increased solubility in water and are subsequently enzymatically degraded to mercapturates and excreted. Several chemotherapeutic and immunosuppressant drugs not only are metabolized by GSTs but also induce their production. This analysis was done to assist the collaboration between Dr. S. Singh of the University of Pittsburgh School of Medicine's Pharmacology department and Troy Wymore of the PSC, who are trying to discover what features of a drug's stereochemistry are related to inducing GST production. Critical features for determining GST class specificity are clustered at either end of the substrate binding site.
In studying the GSTs we have also compared our analysis with two other techniques proposed to investigate the identities of residues that confer biological specificity on families of paralogous biomolecules, namely the principal components analysis proposed by Casari et al., 1995 and the Evolutionary Trace method proposed by Lichtarge et al., 1996. While ultimately the answer as to which method produces the best predictions will depend on experimental evaluation, some significant differences have emerged. The Kullback-Leibler distance asks a more focused biological question and provides a smaller, more focused set of predictions. Our Kullback-Leibler distance corresponds to the biological question of 'What sequence features in a family of sequences most completely discriminate a specific subfamily from other members of the family?'
Sander's (Casari et al., 1995) principal component analysis asks the question, 'What residues in the alignment provide the greatest source of variance among the sequences?', while the Evolutionary Trace method (Lichtarge et al., 1996) asks the question, 'How can we divide the alignment into groups that all show a high level of residue conservation?' We expect our approach, which asks the most focused biological question, to yield better answers, since we are searching for residues that are unique within a subfamily relative to the rest of the family, which will presumably provide selectivity in substrate binding.
References:
Casari, G., Sander, C. and Valencia, A. 1995. A method to predict functional residues in proteins. Nature Structural Biology, 2:171-178.
Lichtarge, O., Bourne, H.R. and Cohen, F.E. 1996. An Evolutionary Trace Method Defines Binding Surfaces Common to Protein Families. Journal of Molecular Biology, 257:342-358.
Perozich, J., Nicholas, H.B., Wang, B.-C., Lindahl, R., and Hempel, J. 1999. Relationship within the Aldehyde Dehydrogenase Extended Family. Protein Science, 8:137-146.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Biotechnology Resource Grants (P41)
Project #: 2P41RR006009-16A1
Application #: 7358390
Study Section: Special Emphasis Panel (ZRG1-BCMB-Q (40))
Project Start: 2006-09-30
Project End: 2007-07-31
Budget Start: 2006-09-30
Budget End: 2007-07-31
Support Year: 16
Fiscal Year: 2006
Total Cost: $1,012
Indirect Cost:
Name: Carnegie-Mellon University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 052184116
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213