Intellectual Merit. With the ability to completely characterize the genomes of closely related species and individuals within species, it is now possible to elucidate the mechanisms by which evolution proceeds at the molecular level, via both the promotion of adaptations within species and the establishment of new species. This project involves a comparative survey of the genome sequences of the complete set of cryptic species of the Paramecium aurelia assemblage of ciliated protozoans. Just prior to the radiation of this complex, the common ancestor experienced a complete doubling of the nuclear genome, and preliminary evidence suggests that silencing of alternative redundant gene copies in sister lineages has led to map changes that may operate as effective reproductive isolating barriers. The relatively young age of the complex, combined with its large number of constituent species and relatively simple genomic architecture, provides a powerful and unprecedented resource for understanding the roles that gene duplication plays in the generation of biodiversity. By establishing the complete history of all ancestral gene copies over a finely dissected phylogeny, the patterns of preservation vs. demise of various functional classes of duplicate genes will be evaluated. The analyses will also reveal the temporal patterns of gene loss that eventually lead to the acquisition of new equilibrium genomic states in the descendant taxa, as well as clarify the extent to which gene resurrections occur. With the inclusion of information on gene expression, several key hypotheses on the evolution of duplicate genes will be tested. Lending an exceptional level of power to the analyses is the availability of information on the rate and complete molecular spectrum of mutations for two aurelia species. 
This information provides a formal basis for deciphering the forces of evolution operating on duplicate genes by supplying a null model for the fates of genes in the absence of selection (e.g., positive selection for preservation or active promotion of gene loss by mutational degradation). As the first study of this sort in a natural assemblage of unicellular eukaryotes, this project has the potential to greatly expand our understanding of the mechanisms of genome evolution, providing a complement to the much richer set of observations on multicellular species.
Broader Impacts. The data generated by this study will serve as a critical and permanent resource for the Paramecium genetics community, while also providing the first detailed data on the dynamics of duplicate genes on an evolutionarily interpretable time scale. The data will be organized into the existing ParameciumDB web-based data system, allowing users to readily query the entire species assemblage for the status and evolutionary history of the full set of paralogous genes back to the ancestor of the P. aurelia complex. In addition, the database will be integrated into a community-level effort to incorporate ciliates into classroom research. Finally, the project will support the training of a graduate student and a postdoctoral fellow, with the goal of establishing them as leaders in the re-emerging field of Paramecium evolutionary genetics.