The rapidly growing database of completely sequenced genomes of bacteria, archaea and eukaryotes (approximately 20 genomes available by the end of 1998 and many more in progress) creates both new opportunities and new challenges. To take advantage of this information, we developed a system of conserved protein families that include likely orthologous proteins (direct counterparts) from 8 completely sequenced genomes. Incorporation of all the remaining completely sequenced genomes is in progress. This system allows nearly automatic functional annotation of more than 50% of the proteins encoded in each of the tested bacterial and archaeal genomes, though only about 20% of the eukaryotic proteins fit into these groups. In addition to functional prediction, this approach provides for the systematic delineation of the set of ancient, conserved protein families that are missing in any particular genome. Examination of evolutionary patterns (i.e., representation of different species and phylogenetic lineages) in the families of orthologs suggests a major role of horizontal gene transfer and lineage-specific gene loss in the evolution of prokaryotes. More specifically, comparative analysis of the first available genome of a hyperthermophilic bacterium (Aquifex aeolicus) and archaeal genomes indicates massive gene exchange between archaeal and bacterial thermophiles. Comparative genomics has now become a part of any study on the evolution of a particular protein family or a functional system. Frequently, examination of the phylogenetic distribution of structural domains and of proteins with specific domain architectures allows detailed reconstruction of evolutionary scenarios. Such analyses were performed for the DNA repair systems of bacteria, archaea and eukaryotes and for the eukaryotic programmed cell death systems. Significant roles of horizontal gene transfer and multiple domain rearrangements in the evolution of both systems were demonstrated, and a number of new functional predictions were made.
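
The two uses of ortholog families mentioned above (reading off which genomes are represented in a family, and listing conserved families missing from a particular genome) can be illustrated with a minimal sketch. It is not the project's actual pipeline: the family identifiers, genome labels, and the `min_genomes` threshold are hypothetical toy values chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy data: family ID -> labels of genomes that encode a member.
# None of these identifiers come from the project's real data.
FAMILIES: Dict[str, Set[str]] = {
    "FAM1": {"bact_A", "bact_B", "arch_A", "euk_A"},
    "FAM2": {"bact_A", "bact_B"},
    "FAM3": {"bact_A", "arch_A", "euk_A"},
    "FAM4": {"arch_A", "thermo_bact"},
}

# Fixed genome order used to write each presence/absence pattern.
GENOMES: List[str] = ["bact_A", "bact_B", "thermo_bact", "arch_A", "euk_A"]


def phyletic_pattern(members: Set[str], genomes: List[str]) -> str:
    """Encode presence (1) or absence (0) of a family in each genome."""
    return "".join("1" if g in members else "0" for g in genomes)


def missing_in(genome: str, families: Dict[str, Set[str]],
               min_genomes: int = 3) -> List[str]:
    """List families found in at least `min_genomes` genomes but not in `genome`.

    The threshold is an assumed stand-in for "widely conserved".
    """
    return sorted(
        fam for fam, members in families.items()
        if genome not in members and len(members) >= min_genomes
    )


if __name__ == "__main__":
    for fam, members in sorted(FAMILIES.items()):
        print(fam, phyletic_pattern(members, GENOMES))
    print("Missing in bact_B:", missing_in("bact_B", FAMILIES))
```

A family shared only by the archaeal and thermophilic bacterial genomes (like the toy `FAM4`) is the kind of pattern that would prompt a closer look at possible horizontal transfer, although a presence/absence pattern alone cannot distinguish transfer from gene loss.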

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000073-03
Application #: 6111075
Study Section: Special Emphasis Panel (CBB)
Support Year: 3
Fiscal Year: 1998
Name: National Library of Medicine
Country: United States

Ivankov, Dmitry N; Payne, Samuel H; Galperin, Michael Y et al. (2013) How many signal peptides are there in bacteria? Environ Microbiol 15:983-90
Rogozin, Igor B; Carmel, Liran; Csuros, Miklos et al. (2012) Origin and evolution of spliceosomal introns. Biol Direct 7:11
Mulkidjanian, Armen Y; Bychkov, Andrew Yu; Dibrova, Daria V et al. (2012) Open questions on the origin of life at anoxic geothermal fields. Orig Life Evol Biosph 42:507-16
Denoeud, France; Henriet, Simon; Mungpakdee, Sutada et al. (2010) Plasticity of animal genome architecture unmasked by rapid evolution of a pelagic tunicate. Science 330:1381-5
Lee, Renny C H; Gill, Erin E; Roy, Scott W et al. (2010) Constrained intron structures in a microsporidian. Mol Biol Evol 27:1979-82
Koonin, Eugene V (2009) Towards a postmodern synthesis of evolutionary biology. Cell Cycle 8:799-800
Yutin, Natalya; Wolf, Maxim Y; Wolf, Yuri I et al. (2009) The origins of phagocytosis and eukaryogenesis. Biol Direct 4:9
Wolf, Yuri I; Novichkov, Pavel S; Karev, Georgy P et al. (2009) Inaugural Article: The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc Natl Acad Sci U S A 106:7273-80
Koonin, Eugene V (2009) On the origin of cells and viruses: primordial virus world scenario. Ann N Y Acad Sci 1178:47-64
Basu, Malay Kumar; Poliakov, Eugenia; Rogozin, Igor B (2009) Domain mobility in proteins: functional and evolutionary implications. Brief Bioinform 10:205-16

Showing the most recent 10 out of 101 publications