The rapidly growing database of completely and nearly completely sequenced genomes of bacteria, archaea, eukaryotes and viruses (several thousand genomes already available and many more in progress) creates both extensive new opportunities and major new challenges for genome research. During the last year, we performed a variety of studies that took advantage of the genomic information to establish fundamental principles of genome evolution. We combined mathematical modeling of genome evolution with comparative analysis of prokaryotic genomes to estimate the relative contributions of selection and intrinsic loss bias to the evolution of different functional classes of genes and mobile genetic elements (MGEs). An exact solution for the dynamics of gene family size was obtained under a linear duplication-transfer-loss model with selection. With the exception of genes involved in information processing, particularly translation, which are maintained by strong selection, the average selection coefficient for most nonparasitic genes is low albeit positive, compatible with the observed positive correlation between genome size and effective population size. Free-living microbes evolve under stronger selection for gene retention than parasites. Different classes of MGEs show a broad range of fitness effects, from the nearly neutral transposons to prophages, which are actively eliminated by selection. Genes involved in antiparasite defense, on average, incur a fitness cost to the host that is at least as high as the cost of plasmids. This cost is probably due to autoimmunity and the curtailment of horizontal gene transfer caused by the defense systems, as well as to the selfish behavior of some of these systems, such as toxin-antitoxin and restriction-modification modules. Transposons follow biphasic dynamics, with bursts of proliferation followed by decay in copy number, a pattern that the model captures quantitatively.
The ratio of horizontal gene transfer to loss, but not the ratio of duplication to loss, correlates with genome size, potentially explaining the increased abundance of neutral and costly elements in larger genomes. The evolution of bacterial and archaeal genomes is highly dynamic and involves extensive horizontal gene transfer and gene loss. Furthermore, many microbial species appear to have open pangenomes, where each newly sequenced genome contains more than 10% ORFans, that is, genes without detectable homologues in other species. We carried out a quantitative analysis of microbial genome evolution by fitting the parameters of a simple, steady-state evolutionary model to the comparative genomic data on the gene content and gene order similarity between archaeal genomes. The results reveal two sharply distinct classes of microbial genes: one is characterized by effectively instantaneous gene replacement, and the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of the size of the prokaryotic genomic universe, which appears to consist of at least a billion distinct genes. Furthermore, the same distribution of constraints is shown to govern the evolution of gene complement and gene order, without the need to invoke long-range conservation or the selfish operon concept. Much of our work was aimed at understanding the evolution of viruses and mobile elements. Among other developments in this area, a survey of bacterial and archaeal genomes shows that many Tn7-like transposons contain minimal type I-F CRISPR-Cas systems that consist of fused cas8f and cas5f, cas7f, and cas6f genes and a short CRISPR array. Several small groups of Tn7-like transposons encompass similarly truncated type I-B CRISPR-Cas systems.
This minimal gene complement of the transposon-associated CRISPR-Cas systems implies that they are competent for pre-CRISPR RNA (pre-crRNA) processing, which yields mature crRNAs, and for target binding, but not for the target cleavage that is required for interference. Phylogenetic analysis demonstrates that evolution of the CRISPR-Cas-containing transposons included a single, ancestral capture of a type I-F locus and two independent captures of type I-B loci. We showed that the transposon-associated CRISPR arrays contain spacers homologous to plasmid and temperate phage sequences and, in some cases, to chromosomal sequences adjacent to the transposon. We hypothesized that the transposon-encoded CRISPR-Cas systems generate displacement loops (R-loops) in the cognate DNA sites, targeting the transposon to these sites and thus facilitating its spread via plasmids and phages. These findings suggest the existence of RNA-guided transposition and fit the guns-for-hire concept whereby mobile genetic elements capture host defense systems and repurpose them for different stages in the life cycle of the element. The rapidly growing metagenomic databases have become a rich information source for the discovery of new groups of viruses and microbes. We have performed several projects in this direction. One of these led to the discovery of a previously unknown but apparently abundant and ecologically important group of viruses. Marine group II Euryarchaeota (MG-II) are among the most abundant microbes in oceanic surface waters. So far, however, representatives of MG-II have not been cultivated, and no viruses infecting these organisms have been described. We present complete genomes for three distinct groups of viruses assembled from metagenomic sequence datasets highly enriched for MG-II.
These novel viruses, which we denote magroviruses, possess double-stranded DNA genomes of 65 to 100 kilobases that encode a structural module characteristic of head-tailed viruses and, unusually for archaeal and bacterial viruses, a nearly complete replication apparatus of apparent archaeal origin. The newly identified magroviruses are widespread and abundant and are therefore likely to be major ecological agents. Taken together, these studies advance the existing understanding of the general principles and specific aspects of genome evolution in diverse life forms, in particular viruses and mobile elements.
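
The linear duplication-transfer-loss model mentioned in the abstract can be illustrated with a minimal sketch. This is not the exact solution obtained in the project: selection is left out entirely, and the function names, parameter names (`gain`, `dup`, `loss`) and numerical values are hypothetical choices made for illustration. Under these assumptions, gene family size behaves as a linear birth-death process with immigration, where transfer adds copies at a constant rate and duplication and loss act per copy. Its stationary distribution follows from detailed balance and, when loss exceeds duplication, has a mean of gain / (loss - dup).

```python
def stationary_family_size(gain, dup, loss, n_max=2000):
    """Stationary copy-number distribution of a gene family.

    Assumed (hypothetical) rates, with no selection term:
      gain -- constant rate of acquiring a copy by horizontal transfer
      dup  -- per-copy duplication rate
      loss -- per-copy loss rate
    The distribution is truncated at n_max copies.
    """
    if gain <= 0 or dup < 0 or loss <= dup:
        raise ValueError("need gain > 0 and 0 <= dup < loss for a steady state")
    # Detailed balance: pi[n] * (gain + dup * n) = pi[n + 1] * loss * (n + 1)
    weights = [1.0]
    for n in range(n_max):
        weights.append(weights[-1] * (gain + dup * n) / (loss * (n + 1)))
    total = sum(weights)
    return [w / total for w in weights]


def mean_family_size(dist):
    """Mean copy number of a distribution given as a list of probabilities."""
    return sum(n * p for n, p in enumerate(dist))


# Illustrative values only: loss outpaces duplication, so families stay small.
dist = stationary_family_size(gain=0.2, dup=0.5, loss=1.0)
print(mean_family_size(dist))  # close to gain / (loss - dup)
```

In this simplified picture, raising the transfer-to-loss ratio raises the expected family size in direct proportion, whereas the duplication-to-loss ratio controls how heavy the tail of the distribution is.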

Support Year
22
Fiscal Year
2017
Name
National Library of Medicine
Krupovic, Mart; Cvirkaite-Krupovic, Virginija; Iranzo, Jaime et al. (2018) Viruses of archaea: Structural, functional, environmental and evolutionary genomics. Virus Res 244:181-193
Yutin, Natalya; Makarova, Kira S; Gussow, Ayal B et al. (2018) Discovery of an expansive bacteriophage family that includes the most abundant viruses from the human gut. Nat Microbiol 3:38-46
He, Fei; Bhoobalan-Chitty, Yuvaraj; Van, Lan B et al. (2018) Anti-CRISPR proteins encoded by archaeal lytic viruses inhibit subtype I-D immunity. Nat Microbiol 3:461-469
Shmakov, Sergey A; Makarova, Kira S; Wolf, Yuri I et al. (2018) Systematic prediction of genes functionally linked to CRISPR-Cas systems by gene neighborhood analysis. Proc Natl Acad Sci U S A 115:E5307-E5316
Pushkarev, Alina; Inoue, Keiichi; Larom, Shirley et al. (2018) A distinct abundant group of microbial rhodopsins discovered using functional metagenomics. Nature 558:595-599
Amarasinghe, Gaya K; Aréchiga Ceballos, Nidia G; Banyard, Ashley C et al. (2018) Taxonomy of the order Mononegavirales: update 2018. Arch Virol 163:2283-2294
Yutin, Natalya; Bäckström, Disa; Ettema, Thijs J G et al. (2018) Vast diversity of prokaryotic virus genomes encoding double jelly-roll major capsid proteins uncovered by genomic and metagenomic sequence analysis. Virol J 15:67
Ferrer, Manuel; Sorokin, Dimitry Y; Wolf, Yuri I et al. (2018) Proteomic Analysis of Methanonatronarchaeum thermophilum AMET1, a Representative of a Putative New Class of Euryarchaeota, "Methanonatronarchaeia". Genes (Basel) 9:
Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I et al. (2018) Phyletic Distribution and Lineage-Specific Domain Architectures of Archaeal Two-Component Signal Transduction Systems. J Bacteriol 200:
Sorokin, Dimitry Y; Makarova, Kira S; Abbas, Ben et al. (2017) Discovery of extremely halophilic, methyl-reducing euryarchaea provides insights into the evolutionary origin of methanogenesis. Nat Microbiol 2:17081

Showing the most recent 10 out of 196 publications