Alternative DNA secondary structures are mutagenic and prone to breakage, which can lead to genetic diseases and cancers. These structures can form when the DNA duplex is unwound during DNA metabolic processes such as replication, and they can in turn disrupt those processes. With increasing recognition of the importance of DNA secondary structures in promoting gene rearrangements, it is timely and critical to carry out a bias-free assessment of the ability of the entire human genome sequence to form secondary structures, especially multiple stem-loop structures. To our knowledge, no such structural database is publicly available. Such a database can serve as a basis for future studies, such as exploration of structure-function relationships of chromosome components, investigation of the influence of DNA structure on DNA metabolic processes, and assessment of the impact of environmental exposures on DNA fragility. In this proposal, we will first analyze, genome-wide, the propensity of DNA to form secondary structures, and use this analysis to identify structural characteristics of fragile sites. The entire available human genome sequence will be evaluated for the potential to form multiple stem-loop structures, using the MFOLD program to create a structure database. This information will be used to directly examine whether secondary structure-forming ability correlates with DNA fragility. Our analysis of chromosome 10 revealed that all fragile sites induced by aphidicolin displayed a higher propensity to fold into stable secondary structures than the rest of the chromosome. This work will also refine the boundaries of the current cytogenetically defined large fragile sites, and define additional fragile sites in regions currently classified as non-fragile. The goal is to compile a list of gene regions possessing high potential to fold into stable secondary structures.
These regions will be validated for secondary structure formation in vitro and for DNA breakage in cells, to directly test whether the propensity to form highly stable secondary structures is an underlying factor in DNA fragility. Then, to work toward applying DNA fragility in a clinical diagnostic test, we will develop a high-throughput DNA breaksite mapping strategy to identify and quantify breaksites within secondary structure-rich and translocation-participating gene regions. We have coupled ligation-mediated PCR breaksite mapping with massively parallel DNA sequencing using an Ion Torrent Personal Genome Machine. Finally, we will examine whether environmental and therapeutic agents generate DNA breaks at these secondary structure-rich and cancer-specific translocation-participating gene regions. These experiments will pave the way for the clinical use of fragile site breakage in diagnostics. This proposal will generate useful tools for structural studies, address the nature of DNA fragility, and further advance our knowledge of the impact of environmental exposures on human disease development.
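The genome-wide comparison described above can be illustrated with a minimal sketch, under stated assumptions. The window length, step size, function names, and example values below are all hypothetical and are not taken from the proposal. The folding free energies are assumed to have been computed separately, one per window, by a folding program such as MFOLD; the sketch does not invoke MFOLD itself. It only shows how a sequence might be divided into overlapping windows and how the mean energy of windows overlapping fragile sites could be compared with that of the remaining windows.

```python
from statistics import mean


def sliding_windows(seq_len, window, step):
    """Yield (start, end) half-open coordinates of overlapping windows.

    Hypothetical helper: window and step are illustrative parameters.
    Any remainder shorter than one step at the end of the sequence
    is not covered by a window.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, max(seq_len - window, 0) + 1, step):
        yield start, min(start + window, seq_len)


def mean_energy_by_class(windows, energies, fragile_regions):
    """Return (mean energy inside fragile regions, mean energy outside).

    energies[i] is the folding free energy of windows[i], assumed to be
    precomputed by an external folding program. A window counts as
    'inside' if it overlaps any fragile region at all.
    """
    inside, outside = [], []
    for (start, end), energy in zip(windows, energies):
        overlaps = any(
            start < region_end and region_start < end
            for region_start, region_end in fragile_regions
        )
        (inside if overlaps else outside).append(energy)
    return (
        mean(inside) if inside else None,
        mean(outside) if outside else None,
    )


# Made-up example values, for illustration only.
windows = list(sliding_windows(seq_len=1000, window=300, step=150))
energies = [-10.0, -40.0, -38.0, -12.0, -14.0]  # one value per window
fragile_regions = [(320, 440)]                  # hypothetical coordinates
inside_mean, outside_mean = mean_energy_by_class(
    windows, energies, fragile_regions
)
```

A more negative mean inside the fragile regions would be consistent with a higher propensity to form stable secondary structures there. A real analysis would also need a statistical test and a principled choice of window size, neither of which this sketch addresses.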
DNA fragility generated by alternative secondary structures is a known cause of many human diseases, and can be affected by nucleotide sequence, cellular activities, and environmental exposures. We propose to create a genome-wide DNA secondary structure database for compiling a panel of gene regions with a high potential to form stable secondary structures, to develop high-throughput breaksite mapping using next-generation sequencing-based technologies, and to examine the effect of environmental exposures on the fragility of these regions. This work has the potential to inform future ways to measure DNA fragility in a personalized approach to diagnostics and treatment.