The recent completion of the human genome sequence has provided the raw data on which to build a deep understanding of how the human organism functions. New ways to attack diseases and ensure health will result from knowledge of genome sequences. However, more than just the sequence of nucleotides is needed to understand how genetic information is retrieved and expressed. The three-dimensional structure of DNA is critical to the functions of the proteins that package the genome and regulate gene expression, yet no existing method is capable of mapping DNA structure on a genomic scale. In this application, a new experimental technology is proposed for determining structural information for genomic DNA. This structural information will add a new dimension to the data that will be produced by the Encyclopedia of DNA Elements (ENCODE) Project. The experimental approach outlined in this application is aimed at the more than 90% of the human genome that does not code for proteins, where the information necessary for gene regulation resides. The method proposed here for making maps of genomic DNA structure uses a high-resolution chemical probe of DNA structure, the hydroxyl radical.
The Specific Aims of this application are: (1) To design and implement a database of hydroxyl radical DNA cleavage patterns. High throughput methods will be used to collect large amounts of experimental hydroxyl radical DNA cleavage data. DNA libraries and ENCODE region DNA sequences will be used for these experiments. (2) To develop methods to predict the hydroxyl radical cleavage pattern of any DNA sequence. The hydroxyl radical cleavage pattern database will be used to construct a model for how the structure of DNA depends on the sequence of nucleotides. This model will be used to predict the hydroxyl radical cleavage patterns of genomic DNA, including parts of the ENCODE regions of the human genome. (3) To make structural maps of genomic DNA. Since previous work has demonstrated that the hydroxyl radical cleavage pattern provides detailed information on the shape of the surface of DNA, experimental (and predicted) hydroxyl radical cleavage patterns will be used to make maps of cis-acting factor binding sites and sequences critical to the folding of chromatin. Computational tools, including hidden Markov models, will be trained on the database of hydroxyl radical cleavage patterns in order to recognize structural patterns in genomic DNA sequences.
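As a rough illustration of Aim 2, the sketch below shows one simple way a database of cleavage patterns could be turned into a sequence-based predictor: average the observed cleavage at the central nucleotide of each short sequence window, then look those averages up along a new sequence. The window length, the function names `build_kmer_table` and `predict_cleavage`, and the input format are assumptions made for illustration only. The application does not specify the prediction model, so this should not be read as the proposed method.

```python
from collections import defaultdict


def build_kmer_table(examples, k=3):
    """Average the observed cleavage intensity at the centre of each k-mer.

    examples: iterable of (sequence, intensities) pairs, where intensities[i]
    is the measured cleavage at nucleotide i (hypothetical input format).
    k: window length; assumed odd so each window has a central nucleotide.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so each window has a centre nucleotide")
    half = k // 2
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, intensities in examples:
        if len(seq) != len(intensities):
            raise ValueError("sequence and intensity lengths differ")
        seq = seq.upper()
        for i in range(half, len(seq) - half):
            kmer = seq[i - half:i + half + 1]
            sums[kmer] += intensities[i]
            counts[kmer] += 1
    return {kmer: sums[kmer] / counts[kmer] for kmer in sums}


def predict_cleavage(seq, table, k=3):
    """Predict a per-nucleotide cleavage pattern from the k-mer table.

    Positions too close to either end, or whose k-mer was never observed,
    are left as None rather than guessed.
    """
    half = k // 2
    seq = seq.upper()
    pattern = [None] * len(seq)
    for i in range(half, len(seq) - half):
        pattern[i] = table.get(seq[i - half:i + half + 1])
    return pattern
```

A table built this way captures only local sequence context. A longer window, or a model such as the hidden Markov models mentioned in Aim 3, would be needed to capture longer-range structural effects.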