Knowledge of the patterns and rates of background substitution is essential for the identification and analysis of functional sequences in the human genome. Given this knowledge, it should be possible to identify functional sequences by comparing two or more genomes. Such sequences will stand out as those that have changed either significantly less or significantly more than expected under the estimated rates of background substitution. Despite this central importance, patterns of background substitution are poorly known. Whether rates of background substitution vary across the human genome, and whether these rates have been evolving in the human lineage, remain controversial questions. The major difficulties lie in identifying sequences that evolve under no functional constraint and in devising methods of inference that account for the rapid, neighbor-dependent CpG to TpG/CpA transition prevalent in mammalian DNA. The high rate and the neighbor-dependence of this process substantially complicate all inferences of substitution, even those of single-nucleotide substitutions at non-CpG sites. This project will utilize a new maximum likelihood method capable of simultaneous inference of the rates of CpG to TpG/CpA transition and of the rates of single-nucleotide substitution significantly beyond the point of naive saturation. This method will be applied to the abundant sequences of dead copies of transposable elements deposited in the human and other mammalian genomes over the last 200-300 million years. This analysis will provide essential new information regarding the evolution of patterns of substitution in mammalian genomes, will create fine-scale (1-5 Mbp) genomic maps of substitution patterns and rates for the human and other mammalian genomes, and will investigate genomic determinants of background substitution patterns in mammals.
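To make the neighbor-dependence described in the abstract concrete, the following minimal Python sketch separates CpG to TpG/CpA transitions from other single-nucleotide substitutions in one pairwise alignment. It is an illustration only and is not the project's maximum likelihood method: it counts naively, makes no correction for multiple substitutions at a site, and so would saturate on old sequences. The function name `classify_substitutions`, the class labels, and the use of an ancestral sequence (for example, a transposable element consensus) as the reference are all assumptions introduced here.

```python
from collections import Counter

# Purine<->purine and pyrimidine<->pyrimidine changes.
TRANSITIONS = {("C", "T"), ("T", "C"), ("A", "G"), ("G", "A")}


def classify_substitutions(ancestral: str, derived: str) -> Counter:
    """Naively count differences between two aligned sequences by class.

    Hypothetical helper for illustration. `ancestral` is assumed to be the
    reference state (e.g. a transposable element consensus) and `derived`
    a dead copy aligned to it, column by column.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    a, d = ancestral.upper(), derived.upper()
    counts = Counter()
    for i, (x, y) in enumerate(zip(a, d)):
        # Skip matches, gaps, and ambiguous bases.
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        # C of an ancestral CpG changing to T yields TpG.
        c_in_cpg = x == "C" and y == "T" and i + 1 < len(a) and a[i + 1] == "G"
        # G of an ancestral CpG changing to A yields CpA.
        g_in_cpg = x == "G" and y == "A" and i > 0 and a[i - 1] == "C"
        if c_in_cpg or g_in_cpg:
            counts["cpg_transition"] += 1
        elif (x, y) in TRANSITIONS:
            counts["other_transition"] += 1
        else:
            counts["transversion"] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    print(classify_substitutions("ACGTACGTAAT", "ATGTACATAGA"))
```

Classifying by the ancestral neighbor, rather than by the changed base alone, is what distinguishes the CpG process from ordinary C to T and G to A transitions; a full treatment would estimate both sets of rates jointly, as the abstract proposes.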

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Genetic Variation and Evolution Study Section (GVE)
Program Officer
Eckstrand, Irene A
Stanford University
Schools of Arts and Sciences
United States
Markova-Raina, Penka; Petrov, Dmitri (2011) High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes. Genome Res 21:863-74
Petrov, Dmitri A; Fiston-Lavier, Anna-Sophie; Lipatov, Mikhail et al. (2011) Population genomics of transposable elements in Drosophila melanogaster. Mol Biol Evol 28:1633-44
Sellis, Diamantis; Callahan, Benjamin J; Petrov, Dmitri A et al. (2011) Heterozygote advantage as a natural consequence of adaptation in diploids. Proc Natl Acad Sci U S A 108:20666-71
Lawrie, David S; Petrov, Dmitri A; Messer, Philipp W (2011) Faster than neutral evolution of constrained sequences: the complex interplay of mutational biases and weak selection. Genome Biol Evol 3:383-95
Hershberg, Ruth; Petrov, Dmitri A (2010) Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet 6:e1001115
González, Josefa; Karasov, Talia L; Messer, Philipp W et al. (2010) Genome-wide patterns of adaptation to temperate environments associated with transposable elements in Drosophila. PLoS Genet 6:e1000905
Cai, James J; Borenstein, Elhanan; Petrov, Dmitri A (2010) Broker genes in human disease. Genome Biol Evol 2:815-25
Hershberg, Ruth; Petrov, Dmitri A (2009) General rules for optimal codon choice. PLoS Genet 5:e1000556
Li, Victor C; Davis, Jerel C; Lenkov, Kapa et al. (2009) Molecular evolution of the testis TAFs of Drosophila. Mol Biol Evol 26:1103-16
González, Josefa; Petrov, Dmitri (2009) Genetics. MITEs--the ultimate parasites. Science 325:1352-3

Showing the most recent 10 out of 21 publications