Efforts to map the functional elements of the human genome, such as insulators, enhancers, promoters and transcriptional start sites, have historically treated the genome as linear. However, it is now well appreciated that the genome has a three-dimensional (3D) organization that is important for mediating functional associations between regulatory elements and gene-coding sequences. Thus, a linear map provides an incomplete picture of the genome, and it is often difficult or impossible to infer functional associations between regulatory elements without a topological framework to provide context. We have developed and advanced a powerful, high-resolution method for providing such a topological framework, Chromatin Interaction Analysis using Paired-End Tag sequencing (ChIA-PET). Using ChIA-PET, we have demonstrated that specific DNA motifs bound by CCCTC-binding Factor (CTCF) are critical in defining topological domains and arranging gene positions for coordinated transcription mediated by RNA Polymerase II (RNAPII). Therefore, the combination of CTCF and RNAPII ChIA-PET will be effective for comprehensively mapping the major structure codes and topological organization that scaffold RNAPII-associated transcriptional regulation. We will contribute ChIA-PET technology to the ENCODE Project to both strengthen existing ENCODE datasets and identify new "structure code" elements and their interplay with gene-coding sequences, which will aid in understanding the transcriptional landscape of the genome. We have established a robust ChIA-PET pipeline, from library production to data processing, for human and mouse cells. Here, we propose to apply this platform to assay large numbers of cell lines and primary cells that represent a wide range of cellular space with important biological significance.
Based on our current production scale and estimated budget allocation, we aim to produce 1024 high-quality datasets: CTCF and RNAPII ChIA-PET experiments, each with two biological replicates, for 256 biological samples (256 samples × 2 factors × 2 replicates). This pipeline capacity will be applied to samples selected by the ENCODE Consortium, and to this sample pool we aim to contribute a collection of high-value biological samples that are likely of common interest to both the Consortium and the greater research community. These samples include both primary and in vitro-differentiated human blood cells, healthy and diseased induced pluripotent stem cells (iPSCs), mature neurons differentiated from the iPSCs, and several major cell and tissue types from healthy and disease-model mice. These samples were selected to expand the "cell space" explored by the ENCODE Project and also because ChIA-PET analyses will be particularly relevant for revealing fundamental biology.
The ENCODE Project aims to create a comprehensive catalog of all of the functional elements in the human genome, which will advance our understanding of genome function and the genetic basis of disease. These functional elements are currently mapped in a linear manner, which does not account for the known three-dimensional genome structures that are critical for proper function. We propose to provide a three-dimensional context to all previously identified and new elements through long-range chromatin interaction mapping, greatly increasing the ability of the research community to interpret the manner in which elements affect genome function. This will in turn aid in our understanding of how mutations in functional elements cause disease.
Li, Xingwang; Luo, Oscar Junhong; Wang, Ping et al. (2017) Long-read ChIA-PET for base-pair-resolution mapping of haplotype-specific chromatin interactions. Nat Protoc 12:899-915