This project aims to characterize the various influences on the composition of the genomes of important eukaryotic model organisms (yeast, Caenorhabditis elegans and Drosophila melanogaster). A genome can be viewed as a sequential arrangement of nucleotides, each of which can be replaced (by mutation) with any of the other nucleotides over the course of evolution. Depending on environmental and genetic context, mutations will be harmful, beneficial or neutral. Mutations that affect fitness (the capacity of the individual to contribute to future generations) are subject to selection, while all mutations are subject to mutational biases. Prior analyses, particularly of fruit flies and warm-blooded vertebrates, have shown that base composition (the relative usage of the four nucleotides) varies across the genome. In fruit flies, there are several nested levels of compositional variation. Some compositional variation correlates with synonymous codon usage, which is subject to selection in many organisms on the basis of its influence on protein synthesis. In the past, a major limitation for studies of codon usage and base composition has been the small number of confidently sequenced genes. However, as the genome sequences of model organisms are largely complete, we can construct data sets of several thousand genes to better resolve the patterns of base composition variation. We can also better describe codon usage bias, and estimate the relative influences of natural selection and regional variation in compositional bias. We therefore propose to analyze compositional and codon biases in selected eukaryotes using relevant statistical methods that have been developed by the principal investigator, as well as new methods that will be developed as part of the project. A fuller understanding of DNA sequence evolution in model organisms will provide a useful contrast for future analyses of the completed human genome sequence. Furthermore, this project will provide a set of statistical tools, as well as computer programs, for compositional analysis of very large DNA sequence data sets.
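As a purely illustrative sketch of the two quantities the abstract refers to, base composition (the relative usage of the four nucleotides) and codon usage, the following Python fragment computes both for a single sequence. It is not the project's software or the principal investigator's statistical method; the function names, the toy sequence, and the use of GC content at third codon positions (GC3) as a simple summary are assumptions made for the example.

```python
from collections import Counter


def base_composition(seq):
    """Return the relative usage of A, C, G and T in a DNA sequence.

    Characters other than A, C, G and T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def codon_counts(cds):
    """Count codons in a coding sequence, read in frame from position 0.

    Any trailing one or two bases that do not form a full codon are dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gc3(cds):
    """Fraction of codons whose third position is G or C (illustrative summary)."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no complete codon")
    gc_third = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc_third / total


# Hypothetical toy coding sequence, not real data: ATG GCC GCT TAA
toy_cds = "ATGGCCGCTTAA"
composition = base_composition(toy_cds)
usage = codon_counts(toy_cds)
third_position_gc = gc3(toy_cds)
```

Here GCC and GCT are synonymous codons (both encode alanine), so comparing their counts across several thousand genes is the kind of tabulation on which an analysis of codon usage bias would build.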

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Academic Research Enhancement Awards (AREA) (R15)
Project #: 7R15HG002456-02
Application #: 6609838
Study Section: Genetics Study Section (GEN)
Program Officer: Good, Peter J
Project Start: 2001-09-30
Project End: 2004-09-29
Budget Start: 2002-07-26
Budget End: 2004-09-29
Support Year: 2
Fiscal Year: 2001
Total Cost: $76,730
Indirect Cost:

Name: Cedar Crest College
Department: Biology
Type: Schools of Arts and Sciences
DUNS #:
City: Allentown
State: PA
Country: United States
Zip Code: 18104

McDermott, Shannon R; Kliman, Richard M (2008) Estimation of isolation times of the island species in the Drosophila simulans complex from multilocus DNA sequence data. PLoS One 3:e2442
Llopart, Ana; Mabille, Aelen; Peters-Hall, Jennifer R et al. (2008) A new test for selection applied to codon usage in Drosophila simulans and D. mauritiana. J Mol Evol 66:224-31
Cirulli, Elizabeth T; Kliman, Richard M; Noor, Mohamed A F (2007) Fine-scale crossover rate heterogeneity in Drosophila pseudoobscura. J Mol Evol 64:129-35
Kliman, Richard M; Bernal, Cheryl A (2005) Unusual usage of AGG and TTG codons in humans and their viruses. Gene 352:92-9
Kliman, Richard M; Hey, Jody (2003) Hill-Robertson interference in Drosophila melanogaster: reply to Marais, Mouchiroud and Duret. Genet Res 81:89-90
Kliman, Richard M; Irving, Naheelah; Santiago, Maria (2003) Selection conflicts, gene expression, and codon usage trends in yeast. J Mol Evol 57:98-109
Hey, Jody; Kliman, Richard M (2002) Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595-608