This project primarily involves empirical analyses to better understand the forces that influence synonymous codon usage in Drosophila, Cryptococcus and human papillomaviruses.
The aims of the Drosophila project are (1) to infer the action of natural selection on codon usage of individual amino acids; (2) to infer the differential action of natural selection on codon usage along the length of genes; and (3) to estimate selection coefficients for synonymous mutations.
The second aim is a follow-up to analyses that indicated a genome-wide peak in preferred codon usage approximately one hundred codons downstream from the start codon. Inference of selection will be based on levels and patterns of DNA sequence polymorphism and divergence. A data set of twenty protein-coding genes, sequenced from start through stop codon, is currently being generated for four strains of Drosophila simulans, five strains of D. mauritiana and one strain of D. sechellia. Based on observed levels of DNA sequence variation, at least another twenty-five genes will be added to this data set to provide sufficient statistical power to address the aims.
The aim of the Cryptococcus project is to test the hypothesis that preferred codon usage reflects selection for efficient expression of gene products. Cryptococcus, a single-celled fungus and an opportunistic human pathogen, has 185 alternatively spliced genes. Nineteen of these have been chosen as suitable candidates for the study, which will compare expression of alternative transcripts with the codon bias of the regions unique to each transcript. It is predicted that expression will correlate positively with the usage of preferred codons, which have been inferred by multivariate statistical analysis. Expression will be measured by quantitative real-time PCR, and confirmed by cloning and molecular analysis of cDNAs. Once this project is completed, a related project on Drosophila will be piloted.
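As an illustration of the kind of codon bias measure that could be compared with expression, the sketch below computes the frequency of preferred codons for a coding region. It is a minimal example under stated assumptions, not the project's method: the function names and the toy codon table are hypothetical, and the preferred/unpreferred labels are placeholders rather than the preferred codons inferred for Cryptococcus.

```python
import math

# Hypothetical toy table: True = preferred, False = unpreferred.
# These labels are placeholders for illustration only; codons absent from
# the table (e.g. ATG, TGG, stop codons) are ignored.
TOY_CODON_STATUS = {
    "CTG": True, "TTA": False,   # two leucine codons
    "GCC": True, "GCT": False,   # two alanine codons
}

def split_codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]

def frequency_of_preferred_codons(cds, codon_status):
    """Return preferred / (preferred + unpreferred) over the codons listed in codon_status."""
    scored = [c for c in split_codons(cds) if c in codon_status]
    if not scored:
        return math.nan  # no informative codons in this region
    return sum(1 for c in scored if codon_status[c]) / len(scored)

# Toy region: ATG CTG TTA GCC GCT TGG; four codons are scored, two are preferred.
example_fop = frequency_of_preferred_codons("ATGCTGTTAGCCGCTTGG", TOY_CODON_STATUS)
print(example_fop)
```

A value computed this way for the region unique to each alternative transcript could then be set against that transcript's measured expression level.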
The aim of the human papillomavirus project is to confirm a preliminary finding that the genomes of these viruses are not at equilibrium with respect to base composition and codon usage. Preliminary data indicate that human papillomaviruses (HPVs), among the most GC-poor of the human-infecting viruses, are in the process of increasing their G+C content. This may indicate adaptation to the human genome, which is much richer in G and C than HPV genomes. Analysis of human-infecting viruses also provides insight into the human DNA replication and gene expression machinery, on which HPVs rely. This project addresses population and evolutionary genetics questions and will lead to new analytical methods. Additionally, two organisms of special concern to human health will be studied: the opportunistic pathogen Cryptococcus neoformans and human papillomaviruses.
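The base composition quantities discussed for the papillomavirus genomes can be illustrated with a short sketch that computes overall G+C content and G+C content at third codon positions. The function names and the toy sequence are assumptions made for illustration; this is not the analysis pipeline used in the project.

```python
import math

def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return math.nan  # no unambiguous bases to score
    return (seq.count("G") + seq.count("C")) / total

def gc3_content(cds):
    """G+C content at third codon positions of an in-frame coding sequence."""
    # Slicing from index 2 with step 3 picks the third base of each complete codon.
    return gc_content(cds[2::3])

# Toy in-frame sequence (hypothetical, not an HPV gene): ATG CTG TTA
toy_cds = "ATGCTGTTA"
print(gc_content(toy_cds), gc3_content(toy_cds))
```

Third-position G+C content is often examined separately because most changes at that position are synonymous, so it reflects codon usage more directly than whole-sequence composition.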