Despite many studies on the mechanisms of DNA double-strand break (DSB) formation, our knowledge of them remains very incomplete. To date, DSB formation has been studied extensively only at specific loci and remains largely unexplored at the genome-wide level. This is owing to the lack of systematic, genome-wide studies that objectively test and compare the proposed mechanisms of DSB formation, as well as the lack of high-resolution genome-wide maps of DSBs, obtained by direct DSB labeling, with which to validate them. Working with collaborators, we have recently developed a method for labeling DSBs in situ, followed by deep sequencing (BLESS), and used it to map DSBs in human cells with a resolution 2-3 orders of magnitude better than previously achieved. Our results show that hypothesis-driven analysis of genomic regions identified at high resolution by BLESS can help explore the basis of genomic instability genome-wide. We discovered that DSBs occur most often in regions that form DNA secondary structures or are highly transcribed. Both may cause collapse of the replication fork, eventually leading to DSBs - the former via fork stalling on DNA secondary structures, the latter because of replication-transcription collisions (RTCs) or formation of RNA-DNA hybrids (R-loops). We therefore hypothesize that the majority of the observed DSBs can be attributed to at least one of three main, non-mutually exclusive endogenous causes: replication fork collapse due to 1) stalling on DNA secondary structures or 2) RTCs, or 3) co-transcriptional R-loop formation. We will test this hypothesis and clarify the relative importance of these mechanisms by pursuing three Specific Aims: 1) Quantify how fork stalling on DNA secondary structures impacts DSB formation; 2) Estimate the contribution of RTCs to DSB formation; and 3) Clarify the influence of R-loops on DSB formation. The work proposed in this application is primarily computational.
The main innovation of this project lies in developing predictive models that will provide the first comprehensive evaluation of the contributions of fork stalling, RTCs and R-loop formation to genomic instability under various conditions in human cells. To construct such models - and to gather the data both to inform and to verify them - we will combine several cutting-edge computational and molecular biology methods. The computational methods will be adapted mostly from theoretical physics, and the experimental methods will include DNA combing, ChIP-Seq and the novel DRIP-Seq method for R-loop detection, in addition to our BLESS method. We expect that our research will reveal a complex and nuanced picture of the mechanisms and context of DSB formation in human cells. It should move the field from studying individual examples of DSBs to a systematic, genome-wide understanding of DSB formation mechanisms and a quantification of their relative importance. Such progress should eventually allow the use of DSB localization signatures for diagnostic and prognostic purposes. We will also provide powerful software tools, experimental methods and rich datasets for future studies going beyond the DNA repair and replication fields.

Public Health Relevance

Francis Collins identified developing high-throughput technology as one of five areas of focus for NIH's research agenda, and one of the NHGRI's strategic goals is achieving maximal sequencing data accuracy as a prerequisite for clinical applications of sequencing methods. Our project addresses both of these goals: it takes a very promising, novel method for direct DSB detection in vivo and improves its accuracy through computational modeling, making it more useful for answering fundamental biological questions. This work will eventually enable the use of this method to generate high-resolution spatial genomic instability maps for diagnostic and prognostic purposes.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Modeling and Analysis of Biological Systems Study Section (MABS)
Program Officer
Lyster, Peter
University of Texas Medical Br Galveston
Schools of Medicine
United States
Kuchta, Krzysztof; Towpik, Joanna; Biernacka, Anna et al. (2018) Predicting proteome dynamics using gene expression data. Sci Rep 8:13866
Clouaire, Thomas; Rocher, Vincent; Lashgari, Anahita et al. (2018) Comprehensive Mapping of Histone Modifications at DNA Double-Strand Breaks Deciphers Repair Pathway Chromatin Signatures. Mol Cell 72:250-262.e6
Biernacka, Anna; Zhu, Yingjie; Skrzypczak, Magdalena et al. (2018) i-BLESS is an ultra-sensitive method for detection of DNA double-strand breaks. Commun Biol 1:181
Aymard, François; Aguirrebengoa, Marion; Guillou, Emmanuelle et al. (2017) Genome-wide mapping of long-range contacts unveils clustering of DNA double-strand breaks at damaged active genes. Nat Struct Mol Biol 24:353-361
Shi, Wei; Vu, Therese; Boucher, Didier et al. (2017) Ssb1 and Ssb2 cooperate to regulate mouse hematopoietic stem and progenitor cells by resolving replicative stress. Blood 129:2479-2492
Kudlicki, Andrzej S (2016) G-Quadruplexes Involving Both Strands of Genomic DNA Are Highly Abundant and Colocalize with Functional Sites in the Human Genome. PLoS One 11:e0146174
Fongang, Bernard; Kong, Fanping; Negi, Surendra et al. (2016) A Conserved Structural Signature of the Homeobox Coding DNA in HOX genes. Sci Rep 6:35415
Fongang, Bernard; Kudlicki, Andrzej (2016) Comparison between Timelines of Transcriptional Regulation in Mammals, Birds, and Teleost Fish Somitogenesis. PLoS One 11:e0155802
Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof et al. (2015) Strategies for achieving high sequencing accuracy for low diversity samples and avoiding sample bleeding using illumina platform. PLoS One 10:e0120520