Apart from the double-helix B-DNA structure discovered by Watson and Crick, approximately 13% of the human genome comprises sequence motifs that can form non-canonical, or non-B, DNA conformations. This project focuses on G-quadruplexes, the type of non-B DNA for which we have the strongest evidence of genome-wide formation and functionality in human cells. There are more than 700,000 putative G-quadruplex loci in the human genome. They constitute ~1% of the genome, compared to ~1.5% occupied by protein-coding exons. Recent in vivo experiments showed that G-quadruplexes regulate key cellular processes (e.g., chromatin organization and transcription). Thus, we hypothesize that some groups of G-quadruplex loci evolve under purifying selection. Yet G-quadruplexes may also represent a hurdle for DNA replication. Our published preliminary results, based on the analysis of long-read sequencing data, demonstrated decreased polymerization speed and increased polymerization errors at G-quadruplex loci genome-wide. We hypothesize that the same phenomena occur in human cells and lead to increased mutagenesis at G-quadruplex loci. Building upon our published and unpublished preliminary results, this project will examine the contribution of G-quadruplex motifs to genome evolution, a topic that has been critically underexplored.
Aim 1 will elucidate the mechanistic basis of the increased mutation rate at G-quadruplex loci, using state-of-the-art high-fidelity duplex sequencing. With in vivo experiments, we will test the hypothesis that mutation rates are increased specifically at G-quadruplex structures forming in human cells and are associated with replication slowdown. With in vitro experiments, we will test the hypothesis that the two major eukaryotic replicative polymerases (polymerases epsilon and delta, responsible for leading and lagging strand synthesis, respectively) stall and have increased error frequencies at G-quadruplexes.
Aim 2 will assess the contribution of G-quadruplex loci to regional variation in mutation rates in the genome and will test the hypothesis that G-quadruplex loci facilitate structural variation in human populations and chromosomal rearrangements during evolution. This Aim will use advanced statistical techniques, including ones from the Functional Data Analysis domain.
Aim 3 will examine selection acting on G-quadruplex loci using classical and novel statistical tests. We will test the hypothesis that G-quadruplexes located in different functional compartments of the genome experience varying selective pressures, e.g., promoter motifs are expected to evolve under strong purifying selection. Moreover, we will investigate a potential association between the biophysical stability of G-quadruplex structures and the strength of selection acting on them. This Aim will also identify groups of physiologically relevant G-quadruplex loci that will drive future functional studies.
Overall, the project will substantially advance our understanding of the contribution of G-quadruplexes to genome evolution and diseases.
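For readers unfamiliar with how putative G-quadruplex loci such as those counted in the abstract are typically located, the following is a minimal sketch of motif scanning. It assumes the commonly used canonical pattern (four runs of three or more guanines separated by loops of 1 to 7 nucleotides); the abstract does not state which motif definition or software this project uses, and the function name `find_putative_g4` is hypothetical.

```python
import re

# Assumption: canonical motif G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# This is an illustrative definition, not necessarily the one used in the project.
LOOP = "[ACGT]{1,7}"
G_MOTIF = re.compile(LOOP.join(["G{3,}"] * 4))  # motif on the given strand
C_MOTIF = re.compile(LOOP.join(["C{3,}"] * 4))  # motif on the opposite strand


def find_putative_g4(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping motif matches.

    Coordinates are 0-based, end-exclusive, on the input sequence.
    """
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G_MOTIF.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_MOTIF.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    # Human telomeric repeat, flanked by two bases on each side.
    print(find_putative_g4("TTGGGTTAGGGTTAGGGTTAGGGAA"))
```

A sequence match of this kind only marks a candidate locus; whether a structure actually forms in cells is an experimental question, which is what the in vivo work in Aim 1 addresses.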

Public Health Relevance

In this project, we propose to study DNA sequences capable of forming non-canonical DNA conformations called G-quadruplexes. G-quadruplex formation is associated with several diseases, including Amyotrophic Lateral Sclerosis and cancer. Studying G-quadruplexes will advance our understanding of their role in disease etiology and will inform the development of personalized disease treatments.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
1R01GM136684-01A1
Application #
10120969
Study Section
Genetic Variation and Evolution Study Section (GVE)
Program Officer
Janes, Daniel E
Project Start
2021-01-01
Project End
2024-12-31
Budget Start
2021-01-01
Budget End
2021-12-31
Support Year
1
Fiscal Year
2021
Total Cost
Indirect Cost
Name
Pennsylvania State University
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
003403953
City
University Park
State
PA
Country
United States
Zip Code
16802