Microsatellite sequences are abundant in the human genome and have mutation rates orders of magnitude higher than those of other genomic sequences. As a result, microsatellites are frequently used as markers in forensics and population genetics. Importantly, microsatellites influence genome function by forming part of protein-coding regions or by regulating gene expression, and allele-length polymorphisms at microsatellites are implicated as genetic risk factors in several diseases. Because the full impact of microsatellite changes on genome function has yet to be elucidated, it is of utmost importance to learn how microsatellites arise, mutate, and eventually cease to exist at individual loci in the human genome. The evolution of each microsatellite has been presented theoretically as a life cycle, with the stages of birth, dynamic mutation activity, and death. However, the concept of the microsatellite life cycle has not previously been investigated in detail. The goal of this interdisciplinary proposal is to elucidate the mechanisms defining the microsatellite life cycle in the human genome. This will be accomplished by a combination of computational and biochemical approaches, and follows the NIH roadmap themes of Interdisciplinary research and Bioinformatics and computational biology.
Specific Aim 1 is to determine the mechanisms of microsatellite birth. We will use biochemical experiments to determine the microsatellite threshold, defined as the minimal number of repeats (or length) required for dynamic mutations to occur. These thresholds will be determined for various motifs and will be used in computational analyses to examine the mechanisms and densities of new microsatellite births. The results of this aim will allow us, for the first time, to derive a regression model explaining variation in microsatellite birth densities across the genome.
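As an illustration of how a repeat-number threshold might be applied computationally, the following minimal Python sketch scans a sequence for perfect tandem repeats that meet a per-motif-length minimum. It is not the project's method: the function names and the threshold values in `EXAMPLE_MIN_REPEATS` are hypothetical placeholders, since the actual thresholds are what Aim 1 proposes to measure, and the sketch ignores interrupted repeats.

```python
import re

# Hypothetical placeholder thresholds: minimum repeat count per motif length.
# These are NOT values from the proposal, which plans to measure them.
EXAMPLE_MIN_REPEATS = {1: 9, 2: 5, 3: 4, 4: 4}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_microsatellites(seq, min_repeats=EXAMPLE_MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem
    repeats whose repeat count is at or above the threshold for their motif
    length."""
    seq = seq.upper()
    hits = []
    for motif_len, threshold in sorted(min_repeats.items()):
        # One motif copy followed by at least (threshold - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, threshold - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as "AA" that are really shorter-unit repeats.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)
```

Calling `find_microsatellites` on a sequence returns the start position, motif, and repeat count of each run at or above its threshold. Counting such runs per genomic window would give the kind of density values that a regression model could then try to explain.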
Specific Aim 2 will examine microsatellite interruption and death. Our preliminary studies demonstrate that microsatellite interruptions can be observed frequently in the human genome, and that DNA polymerases can directly produce such interruptions in vitro. This aim will use computational and biochemical techniques to measure the mutational consequences of interruptions and the extent to which they contribute to microsatellite death.
Specific Aim 3 is to computationally determine the mechanisms contributing to variation in mature microsatellite mutation rates among and within individual human genomes, and to biochemically determine the specific mechanisms contributed by intrinsic features.
Overall, the results of this project will be of considerable significance for our understanding of the dynamics of genome evolution. Additionally, our research proposal has direct relevance to public health and clinical genetics. The new information gained by our research can be used to predict the probability that each microsatellite will undergo mutation or cease to exist, and the probability that any genomic region will bear a new microsatellite. This will be of major importance for assessing an individual's disease risks, especially in an era when individual human genomes are being rapidly sequenced.

Public Health Relevance

Repetitive DNA sequences called microsatellites are characteristic of primate genomes and are known to regulate gene expression. Mutations within microsatellite sequences are causally linked to the development of several human diseases. Our interdisciplinary project will elucidate the mechanisms whereby microsatellites arise, mutate, and disappear at distinct loci in individual human genomes. This research could have major consequences for predicting the risk of diseases caused by microsatellites.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Genetic Variation and Evolution Study Section (GVE)
Program Officer: Krasnewich, Donna M
Pennsylvania State University
Schools of Arts and Sciences
University Park
United States
Barnes, Ryan P; Hile, Suzanne E; Lee, Marietta Y et al. (2017) DNA polymerases eta and kappa exchange with the polymerase delta holoenzyme to complete common fragile site synthesis. DNA Repair (Amst) 57:1-11
Fungtammasan, Arkarachai; Tomaszkiewicz, Marta; Campos-Sánchez, Rebeca et al. (2016) Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats. Mol Biol Evol 33:2744-58
Baptiste, Beverly A; Jacob, Kimberly D; Eckert, Kristin A (2015) Genetic evidence that both dNTP-stabilized and strand slippage mechanisms may dictate DNA polymerase errors within mononucleotide microsatellites. DNA Repair (Amst) 29:91-100
Makova, Kateryna D; Hardison, Ross C (2015) The effects of chromatin organization on variation in mutation rates in the genome. Nat Rev Genet 16:213-23
Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E et al. (2015) Accurate typing of short tandem repeats from genome-wide sequencing data and its applications. Genome Res 25:736-49
Ananda, Guruprasad; Hile, Suzanne E; Breski, Amanda et al. (2014) Microsatellite interruptions stabilize primate genomes and exist as population-specific single nucleotide polymorphisms within individual human genomes. PLoS Genet 10:e1004498
Campos-Sánchez, Rebeca; Kapusta, Aurélie; Feschotte, Cédric et al. (2014) Genomic landscape of human, bat, and ex vivo DNA transposon integrations. Mol Biol Evol 31:1816-32
Kuruppumullage Don, Prabhani; Ananda, Guruprasad; Chiaromonte, Francesca et al. (2013) Segmenting the human genome based on states of neutral genetic divergence. Proc Natl Acad Sci U S A 110:14699-704
Hile, Suzanne E; Shabashev, Samion; Eckert, Kristin A (2013) Tumor-specific microsatellite instability: do distinct mechanisms underlie the MSI-L and EMAST phenotypes? Mutat Res 743-744:67-77
Montgomery, Stephen B; Goode, David L; Kvikstad, Erika et al. (2013) The origin, evolution, and functional impact of short insertion-deletion variants identified in 179 human genomes. Genome Res 23:749-61
