This project will combine molecular and computational biology to study the evolution of diversity in gene expression. In all organisms, gene expression is controlled, or regulated, by crosstalk between proteins called transcription factors and the short, specific regions of DNA to which they bind (transcription factor binding sites). A key to understanding this crosstalk is learning the precise nature of the "regulatory vocabulary" of transcription factor binding sites, which may vary among organisms. This project will focus on deciphering the regulatory language in multiple species of the single-celled protozoan Paramecium, a well-established model system in cell biology. The overall project will advance the formal training of undergraduates, graduate students and postdoctoral fellows, while also establishing a set of molecular resources for the international research community. Specific efforts are being made to organize and integrate all genomic and gene-expression results into a web repository that will allow users to readily query data for dozens of Paramecium species and obtain information about the status and evolutionary history of all known genes. The scientific and training results from this project will help promote the growth of the field of evolutionary cell biology, thereby broadening our understanding of the diversity of cellular functions across the Tree of Life.

Much of biological diversity owes its origin to gene duplication and modifications in gene regulatory patterns. The Paramecium system is well suited to studying the impacts of gene duplication because the evolution of this genus included two complete-genome duplications that led to the establishment of a clade of distantly related but morphologically nearly identical species. This situation is of broad interest because two rounds of genome duplication are also thought to have preceded the emergence and diversification of the major vertebrate lineages. With complete sequences now available for most Paramecium species, and the evolutionary history of each gene known, the next goal is to determine the cellular mechanisms responsible for differential gene expression and the evolutionary mechanisms associated with differential gene survival. Using a set of efficient methods for discovering transcription factor binding sites, this project will elucidate the evolutionary history of the gene-regulation vocabulary over the billion-year history of the genus Paramecium. Comparisons among lineages that did or did not undergo complete-genome duplications will help define the extent to which evolutionary dynamics are altered in the face of massive gene duplication. Paramecium shares some key features with animals, such as harboring a transcriptionally silent germline genome as well as an active somatic nucleus, all within the confines of a single, highly complex cell. Thus, results of these studies in Paramecium, which is exceptionally easy to manipulate, may shed light on the evolution of regulatory diversity in other organisms that are not as experimentally tractable.

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Application #: 1518060
Program Officer: Karen Cone
Project Start:
Project End:
Budget Start: 2015-08-15
Budget End: 2018-05-31
Support Year:
Fiscal Year: 2015
Total Cost: $1,028,818
Indirect Cost:

Name: Indiana University
Department:
Type:
DUNS #:
City: Bloomington
State: IN
Country: United States
Zip Code: 47401