For a gene to carry out its function, its DNA sequence must be copied into mRNA molecules, a process called transcription. Cells are highly selective about where they begin (initiate) transcription of their genes. The mechanism for locating transcription start positions is shared by most eukaryotic organisms, including humans, plants, and fungi. Yet a distinct mode of transcription initiation has been found in a few fungal species, including baker’s yeast. This project aims to determine when and how the mechanisms of transcription initiation diverged. Success of this project will advance basic understanding of the evolution and genetic mechanisms of transcription initiation, a first step in harnessing the power of biology for the bioeconomy. This project also develops new software and generates a large collection of mRNA sequences, which will benefit the research community. In addition, this project will build a user-friendly database that can serve as a portal for improving public literacy in genomics and can facilitate teaching in K-12 schools and colleges. This project also contributes to the development of a globally competitive STEM workforce by training a postdoctoral scholar and graduate, undergraduate, and high school students.
In most eukaryotes, transcription is initiated by a large enzyme complex, the preinitiation complex, about 30 nucleotides downstream of a short sequence element known as the TATA box, where the preinitiation complex is assembled. In contrast, in Saccharomyces cerevisiae the preinitiation complex scans DNA sequences farther downstream of the TATA box for favorable transcription start sites. This project aims to elucidate when the transcription initiation mechanisms diverged and to identify key genetic changes associated with this process. This project applies a combination of computational and experimental approaches to determine the initiation mechanism of 68 fungal species. Aim 1 of this project will develop novel bioinformatics tools for accurate mapping of transcription start sites based on high-throughput sequencing data. In Aim 2, this project will capture the first nucleotide of RNA molecules and sequence these transcripts to generate transcription start site maps for 68 species representing all major fungal lineages. The initiation mechanism of each species will be determined from the genome-wide distribution of the TATA box relative to the transcription start site. A detailed evolutionary history of initiation mechanisms will be reconstructed to pinpoint when the divergence occurred. Aim 3 will identify the key genes associated with the divergence of transcription initiation mechanisms based on their evolutionary history and expression profiles. Important genetic changes in these genes are also expected to be determined through comparative sequence analyses. These results will lay a solid foundation for future interrogation of the specific functions of preinitiation complex components in transcription initiation.
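As a rough illustration of the TATA-box distance analysis described above, the following minimal Python sketch measures how far upstream of each transcription start site a TATA-like motif lies and summarizes a species by the median distance. It is not the project's software. The function names, the use of the consensus TATAWAWR as a stand-in for TATA-box detection, the convention of measuring from the first base of the motif, and the 40-nucleotide cutoff are all assumptions made for illustration only.

```python
import re
from statistics import median

# Assumption: approximate the TATA box by the consensus TATAWAWR
# (W = A or T, R = A or G). The lookahead lets overlapping matches be found.
TATA_PATTERN = re.compile(r"(?=TATA[AT]A[AT][AG])")


def tata_to_tss_distances(upstream_seq):
    """Distances (nt) from the first base of each TATA-like match to the TSS.

    `upstream_seq` is the sense-strand sequence that ends immediately before
    the transcription start site, so the TSS sits at index len(upstream_seq).
    """
    tss_index = len(upstream_seq)
    return [tss_index - m.start() for m in TATA_PATTERN.finditer(upstream_seq.upper())]


def classify_species(upstream_seqs, cutoff=40):
    """Label a species from the median distance of the TSS-nearest TATA match.

    `cutoff` is a hypothetical threshold (nt), not a value from the project.
    Returns (label, median_distance), or (None, None) if no motif is found.
    """
    nearest = []
    for seq in upstream_seqs:
        distances = tata_to_tss_distances(seq)
        if distances:
            nearest.append(min(distances))
    if not nearest:
        return None, None
    med = median(nearest)
    return ("classic" if med <= cutoff else "scanning"), med


if __name__ == "__main__":
    # Toy promoters: motif 30 nt versus 90 nt upstream of the TSS.
    near = "G" * 20 + "TATAAAAG" + "C" * 22
    far = "G" * 10 + "TATAAAAG" + "C" * 82
    print(classify_species([near] * 3))
    print(classify_species([far] * 3))
```

A real analysis would presumably use experimentally mapped start sites and a more sensitive motif model, such as a position weight matrix, rather than an exact consensus match.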
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.