Transfer RNAs (tRNAs) are fundamental to life due to their vital roles in protein translation. tRNAs are needed in large abundance for normal cellular functions, which requires that tRNA genes be among the most highly transcribed loci in the genome. Because of their rigidly defined structure and interactions with many other molecules, each base within tRNA genes is highly conserved. Pre-tRNA transcripts also include leader and trailer sequences, which have minimal functionality in most cases and are quickly processed out as part of the tRNA maturation process. Nonetheless, high levels of transcription at these loci can lead to extremely high levels of variation, likely due to transcription-associated mutagenesis (TAM). TAM at tRNA loci has important implications. tRNA genes exist in many copies throughout the genome, and while many of these genes are constitutively transcribed, epigenomic data show that a majority appear to be completely inactive. Variation in tRNA gene expression within and between species makes annotation of expression essential for interpreting the potential effects of natural variants in populations. Greater understanding of tRNA locus variation could enable prioritization of risk loci in genome-wide association studies, as variants in active tRNA genes could have pronounced fitness consequences. However, annotation of tRNA expression levels is difficult for many reasons, including post-transcriptional modifications that impede RNA sequencing and the high redundancy of tRNAs at the gene level. I will develop a predictive classifier, which will use only DNA sequence data, to infer and annotate tRNA gene expression across mammals. There are strong evolutionary implications of increased tRNA transcription as well. Virtually all eukaryotic genomes contain upwards of 200 tRNA genes. Theory predicts that duplicated genes will quickly diverge in function and sequence, generally by neo- or sub-functionalization.
However, these predictions assume low and equal germline mutation rates among genes. Therefore, elevated mutation rates at tRNA loci may drive the conservation of hundreds of functionally redundant genes. I will develop an individual-based population genetic simulation framework, using estimates of the per-locus mutation rates at tRNA genes, as well as their duplication and deletion rates. I will then compare simulation results to the actual human tRNA distribution to quantitatively test each component of this model. Adding further complexity, modifications at the wobble position of mature tRNAs often alter their decoding repertoire and are essential for proper translation. Differences in tRNA modification between species may lead to differences in wobble potential, and thus change codon usage bias. For example, several closely related Drosophila species exhibit drastic shifts in codon preference despite no changes in tRNA gene copy number. To investigate the evolutionary influence of anticodon base pairing, I will analyze the relative contributions of modification enzymes to codon preference shifts, using Drosophila as a model.
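As a rough illustration of the individual-based simulation described above, the following minimal sketch tracks functional and inactivated tRNA gene copies in a population under mutation, duplication, deletion, and selection. It is not the proposed framework: the function names, the saturating fitness function, and every parameter value (population size, rates, required copy number) are hypothetical placeholders chosen only to make the example run.

```python
import random

def step_individual(genes, mu, dup, dele, rng):
    """Apply deletion, inactivating mutation, and duplication to each copy.

    genes is a list of booleans; True marks a functional gene copy.
    """
    out = []
    for functional in genes:
        if rng.random() < dele:
            continue  # this copy is deleted
        if functional and rng.random() < mu:
            functional = False  # inactivating mutation
        out.append(functional)
        if rng.random() < dup:
            out.append(functional)  # duplicate inherits the current state
    return out

def fitness(genes, required):
    """Hypothetical fitness: rises with functional copies, saturating at 1."""
    return min(1.0, sum(genes) / required)

def simulate(pop_size=200, generations=100, start_copies=5, required=3,
             mu=0.01, dup=0.002, dele=0.002, seed=0):
    """Wright-Fisher-style resampling with all rates as assumed placeholders."""
    rng = random.Random(seed)
    pop = [[True] * start_copies for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(g, required) for g in pop]
        if sum(weights) == 0:
            break  # no individual has a functional copy left
        # Sample parents in proportion to fitness, then mutate offspring.
        parents = rng.choices(pop, weights=weights, k=pop_size)
        pop = [step_individual(p, mu, dup, dele, rng) for p in parents]
    return pop

if __name__ == "__main__":
    final = simulate()
    mean_functional = sum(sum(g) for g in final) / len(final)
    print(f"mean functional copies per individual: {mean_functional:.2f}")
```

In a sketch like this, the simulated distribution of copy numbers could be compared against an observed distribution while varying one rate at a time, which mirrors the component-by-component test outlined above.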
Transfer RNA (tRNA) genes are among the most essential and highly transcribed genes, and are likely subject to high mutation rates. These elevated mutation rates have a currently unexplored and potentially profound impact on human disease. This proposal aims to model observed genetic variation in humans and other mammals to predict the relative importance of each human tRNA gene in disease.