The rapidly growing database of completely sequenced genomes of bacteria, archaea and eukaryotes (approximately 35 genomes available by the end of 2000 and many more in progress) creates both new opportunities and new challenges for genome research. To take advantage of this information, we developed a system of Clusters of Orthologous Groups of proteins (COGs) from 30 completely sequenced genomes. This database is continuously updated to incorporate newly sequenced genomes. The COG approach allows nearly automatic functional annotation of 60-80% of the proteins encoded in each of the tested bacterial and archaeal genomes, although only about 30% of the eukaryotic proteins fit into these groups. In addition to functional prediction, this approach provides for the systematic delineation of the set of ancient, conserved protein families that are missing in any particular genome. Examination of evolutionary patterns (i.e. representation of different species and phylogenetic lineages) in the families of orthologs suggests a major role of horizontal gene transfer and lineage-specific gene loss in the evolution of prokaryotes. More specifically, we found evidence of massive horizontal gene transfer among the archaea, between archaea and thermophilic bacteria, and between bacterial parasites and their eukaryotic hosts. Additionally, we investigated in detail the lineage-specific gene expansions in eukaryotes and their possible adaptive significance, and constructed a theoretical model of genome evolution that agrees well with empirical data on protein family sizes.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000073-07
Application #: 6681352
Study Section: (CBB)
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 7
Fiscal Year: 2002
Total Cost:
Indirect Cost:
Name: National Library of Medicine
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code:
Ivankov, Dmitry N; Payne, Samuel H; Galperin, Michael Y et al. (2013) How many signal peptides are there in bacteria? Environ Microbiol 15:983-90
Rogozin, Igor B; Carmel, Liran; Csuros, Miklos et al. (2012) Origin and evolution of spliceosomal introns. Biol Direct 7:11
Mulkidjanian, Armen Y; Bychkov, Andrew Yu; Dibrova, Daria V et al. (2012) Open questions on the origin of life at anoxic geothermal fields. Orig Life Evol Biosph 42:507-16
Denoeud, France; Henriet, Simon; Mungpakdee, Sutada et al. (2010) Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate. Science 330:1381-5
Lee, Renny C H; Gill, Erin E; Roy, Scott W et al. (2010) Constrained intron structures in a microsporidian. Mol Biol Evol 27:1979-82
Koonin, E V; Wolf, Y I; Puigbò, P (2009) The phylogenetic forest and the quest for the elusive tree of life. Cold Spring Harb Symp Quant Biol 74:205-13
Mulkidjanian, Armen Y; Galperin, Michael Y; Koonin, Eugene V (2009) Co-evolution of primordial membranes and membrane proteins. Trends Biochem Sci 34:206-15
Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V (2009) Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct 4:19
Makarova, Kira S; Wolf, Yuri I; van der Oost, John et al. (2009) Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements. Biol Direct 4:29
Koonin, Eugene V; Makarova, Kira S (2009) CRISPR-Cas: an adaptive immunity system in prokaryotes. F1000 Biol Rep 1:95

Showing the most recent 10 out of 101 publications