A method for detecting conserved motifs in protein sequence databases and assessing their statistical significance was developed and implemented in the CAP (Consistent Alignment Parser) and MoST (Motif Search Tool) programs. The MoST procedure consists of iteratively abstracting from an alignment block a weight matrix representing the conserved motif, scanning the database with this matrix, and locating new segments to add to the alignment block. The last step is based upon the statistics of score distributions for position-dependent weight matrices. This technique was applied to the analysis of several protein classes. We showed that eukaryotic translation elongation factor EF1g contains a domain related to glutathione S-transferases (GST). Two motifs that are conserved in a vast class of GST-related proteins were defined, and a possible role for the GST activity of EF1g in the assembly of protein complexes involved in translation was proposed. We found that human tumor-specific nucleolar protein P120 contains a conserved S-adenosylmethionine-binding motif and belongs to a family of putative rRNA methyltransferases that may be involved in the control of proliferation of both eukaryotic and bacterial cells. We explored the evolution of bacterial hydrolytic dehalogenases, which are crucial for detoxification of xenobiotics. Two types of dehalogenases were shown to belong to large, distinct superfamilies of enzymes found in all organisms, each of which includes many previously uncharacterized proteins. One of these superfamilies contains transferases and oxidoreductases in addition to hydrolases, thereby revealing an evolutionary connection between different enzyme classes. Two new superfamilies of nucleosidases were characterized, as well as an unexpected structural and evolutionary relationship between thymidine phosphorylases and anthranilate phosphoribosyltransferases.
We conclude that screening sequence databases with motifs, particularly in the form of position-dependent weight matrices derived from alignment blocks, extracts a significant amount of new information compared with standard pairwise similarity searches. A general biological conclusion is that enzyme evolution involves a complex interplay between divergence and functional convergence. In many instances, evolutionarily related enzymes catalyze different reactions, whereas enzymes of similar specificity frequently appear to have different origins.
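
The iterative procedure described above (derive a weight matrix from an alignment block, scan the database with it, add the new segments to the block, repeat) can be illustrated with a minimal sketch. This is not the MoST implementation. The function names, the uniform background frequencies, the pseudocount, and the fixed score cutoff are all assumptions made for illustration; MoST itself selects new segments using the statistics of score distributions rather than a hand-picked cutoff.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(block, pseudocount=1.0):
    """Build a log-odds position weight matrix from equal-length segments.

    Assumptions (not from the source): uniform background frequencies
    and a flat pseudocount added to every residue count.
    """
    width = len(block[0])
    if any(len(seg) != width for seg in block):
        raise ValueError("all segments in the block must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for col in range(width):
        counts = Counter(seg[col] for seg in block)
        pwm.append({
            aa: math.log((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def scan(pwm, sequence, cutoff):
    """Return (offset, segment, score) for each window scoring >= cutoff."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(aa not in AMINO_ACIDS for aa in window):
            continue  # skip windows with non-standard residues
        score = sum(column[aa] for column, aa in zip(pwm, window))
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


def iterate_search(block, database, cutoff, max_rounds=10):
    """Rebuild the matrix and rescan until no new segments are found.

    The fixed cutoff is a stand-in for a statistically derived threshold.
    """
    block = list(block)  # work on a copy; the caller's list is not modified
    for _ in range(max_rounds):
        pwm = build_pwm(block)
        known = set(block)
        new_segments = []
        for sequence in database:
            for _, segment, _ in scan(pwm, sequence, cutoff):
                if segment not in known:
                    known.add(segment)
                    new_segments.append(segment)
        if not new_segments:
            break
        block.extend(new_segments)
    return block


# Toy data, invented for illustration only.
seed_block = ["GSGKST", "GTGKST", "GAGKTT"]
toy_database = ["AAGSGKTTAA", "PPPPPPPPPP"]
expanded_block = iterate_search(seed_block, toy_database, cutoff=5.0)
```

In this toy run the segment GSGKTT, which is absent from the seed block, scores above the cutoff and is added in the first round; the second round finds nothing new and the loop stops.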

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000061-01
Application #: 3759328
Support Year: 1
Fiscal Year: 1994
Name: National Library of Medicine
Country: United States