Somatic mutations are a non-heritable class of mutations that can be highly detrimental to the survival of an organism. In particular, somatic mutations that modify DNA fidelity, cellular growth rates, or immune response evasion can generate malignant cells (tumors) that reduce fecundity or, in extreme cases, are lethal. Yet despite the influence that somatic mutations can have on the health of multicellular organisms, very little is known about their rate and spectrum. Because somatic mutations arise at a very low frequency per generation and are non-heritable, they remain difficult to assay using standard techniques, even with recent advances in high-throughput sequencing technology. The primary objective of this project is to design a maximum-likelihood (ML) method that can overcome the issues that arise when surveying low-frequency mutations with high-throughput sequencing. Recent work has shown that maximum-likelihood methods provide unbiased estimates of heterozygosity and rare alleles in low-coverage genome projects. This work can be expanded to estimate somatic mutation rates from high-throughput sequencing of any tissue with a known cell genealogy. I propose to test this model using Monte Carlo simulation of somatic mutations that arise in a cell genealogy with a variable number of parameters. After these benchmarks, the ML framework can be applied to high-mutator cell lineages, eukaryotic tissue, and cancer genomes to determine its ability to estimate somatic mutation rates. This project will provide a deeper understanding of somatic mutation rate and spectrum. In addition, the results will be relevant to a broad array of scientific and medical fields, including the study of cell aging, tumor development, and the boundaries of mutation rate. Furthermore, the models developed here can determine the cost-benefit tradeoff of high-throughput sequencing in somatic mutation assays and have the potential to greatly reduce the enormous costs that may come from individual tumor sequencing in the future.
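The general idea of benchmarking an ML estimator of a low-frequency variant by Monte Carlo simulation can be illustrated with a toy sketch. This is not the method proposed in the project: it ignores the cell genealogy entirely and assumes a single site with two alleles, a known symmetric per-read error rate, and independent reads. All function names and parameter values here are hypothetical choices for illustration.

```python
import random


def simulate_alt_reads(f, e, n, rng):
    """Draw the number of reads showing the variant among n reads at one site.

    Assumed toy model: a fraction f of sampled DNA carries the variant, and
    each read is miscalled (variant <-> reference) with probability e.
    """
    p = f * (1 - e) + (1 - f) * e  # chance that a single read shows the variant
    return sum(rng.random() < p for _ in range(n))


def ml_frequency(k, n, e):
    """ML estimate of the variant frequency from k variant reads out of n.

    Under the binomial model above (with e < 0.5), the ML estimate of p is
    k / n; inverting p = f(1 - e) + (1 - f)e and clipping to [0, 1] gives
    the ML estimate of f.
    """
    p_hat = k / n
    f_hat = (p_hat - e) / (1 - 2 * e)
    return min(1.0, max(0.0, f_hat))


def monte_carlo_mean(f, e, n, replicates, seed=1):
    """Average the ML estimate over many simulated sequencing experiments."""
    rng = random.Random(seed)
    estimates = [
        ml_frequency(simulate_alt_reads(f, e, n, rng), n, e)
        for _ in range(replicates)
    ]
    return sum(estimates) / replicates


# Hypothetical settings: 5% variant frequency, 1% error rate, 1000x coverage.
mean_estimate = monte_carlo_mean(f=0.05, e=0.01, n=1000, replicates=2000)
print(f"true frequency 0.05, mean ML estimate {mean_estimate:.4f}")
```

Comparing the mean estimate with the simulated frequency across a range of coverages and error rates is one simple way to explore the kind of cost-benefit tradeoff mentioned above.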

Public Health Relevance

Somatic mutations are non-heritable changes to the DNA that can occur in any somatic cell in the body and, depending on where and when they occur, can generate cancerous cells (tumors). The goal of this project is to apply a maximum-likelihood method to determine the rate and spectrum of somatic mutations. By understanding how somatic mutations evolve, we can begin to understand tumor development, cell aging, and eukaryotic evolution.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Postdoctoral Individual National Research Service Award (F32)
Study Section: Special Emphasis Panel (ZRG1-F08-Q (20))
Program Officer: Reddy, Michael K
Indiana University Bloomington
Schools of Arts and Sciences
United States
Morton, Elise R; Merritt, Peter M; Bever, James D et al. (2013) Large deletions in the pAtC58 megaplasmid of Agrobacterium tumefaciens can confer reduced carriage cost and increased expression of virulence genes. Genome Biol Evol 5:1353-64
Bik, Holly M; Fournier, David; Sung, Way et al. (2013) Intra-genomic variation in the ribosomal repeats of nematodes. PLoS One 8:e78230