Regulation of gene expression is crucial for proper development. Our studies aim to understand the mechanism of epigenetic inheritance of gene expression and to uncover unrecognized ways of disrupting gene expression that may contribute to developmental defects in humans. Polycomb group (PcG) and Trithorax group (TrxG) proteins are important for epigenetic inheritance of the silenced and the active chromatin state, respectively. They are present in all metazoans. In Drosophila, regulatory elements called Polycomb group response elements (PREs) are required for the recruitment of chromatin-modifying PcG protein complexes to the DNA. TrxG proteins act through the same or overlapping cis-acting sequences as PcG proteins. Our group is working to understand how PcG and TrxG proteins are recruited to the DNA. Another goal of our research is to understand how PREs interact with other regulatory elements within a locus. During the past year, we made two important discoveries, described below.

Combgap is a PRE-DNA binding protein required for a subset of PREs (1).

PREs are made up of binding sites for many different DNA-binding proteins, some identified and some unknown. Identifying all the proteins involved in PcG protein recruitment is a necessary step in understanding how recruitment occurs. Using a DNA-affinity pull-down assay coupled to mass spectrometry, we identified the 11-zinc-finger DNA-binding protein Combgap (Cg) as a candidate PRE-DNA binding protein. cg mutants were previously studied in Drosophila, and it was proposed that Cg protein could act as either a transcriptional repressor or an activator, depending on the context. Chromatin immunoprecipitation with an antibody to Cg, followed by next-generation sequencing (ChIP-seq), showed that Cg binds to PREs, among other places in the genome. Further, our data showed that Cg binding overlaps extensively with that of the PcG protein Polyhomeotic (Ph).
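As an illustration only, the extent to which one set of ChIP-seq peaks overlaps another can be summarized as the fraction of peaks in the first set that intersect at least one peak in the second. The sketch below is a minimal Python example under stated assumptions: peaks are (chromosome, start, end) tuples with half-open coordinates, and the function name and the example coordinates are hypothetical. It is not the analysis pipeline used in the study.

```python
from bisect import bisect_left
from collections import defaultdict


def overlap_fraction(query_peaks, target_peaks):
    """Fraction of query peaks that overlap at least one target peak.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Hypothetical helper for illustration; not the study's method.
    """
    # Group target peaks by chromosome and sort them by start position.
    by_chrom = defaultdict(list)
    for chrom, start, end in target_peaks:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        # Running maximum of end positions, so one lookup tells us whether
        # any target peak starting before a position reaches past another.
        max_ends = []
        running = float("-inf")
        for _, e in intervals:
            running = max(running, e)
            max_ends.append(running)
        index[chrom] = (starts, max_ends)

    if not query_peaks:
        return 0.0

    hits = 0
    for chrom, start, end in query_peaks:
        if chrom not in index:
            continue
        starts, max_ends = index[chrom]
        # Number of target peaks that start before the query peak ends.
        n_before = bisect_left(starts, end)
        # Overlap exists if any of those peaks ends after the query starts.
        if n_before > 0 and max_ends[n_before - 1] > start:
            hits += 1
    return hits / len(query_peaks)


# Made-up coordinates, for demonstration only.
cg_peaks = [("2R", 100, 200), ("2R", 500, 600), ("3L", 50, 80)]
ph_peaks = [("2R", 150, 250), ("3L", 200, 300)]
print(overlap_fraction(cg_peaks, ph_peaks))
```

In practice, peak overlap is usually computed with dedicated interval tools, and a statistical comparison against a background expectation would be needed before calling an overlap extensive.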
Many PcG proteins are present in protein complexes; the two most studied are Polycomb repressive complex 1 (PRC1) and Polycomb repressive complex 2 (PRC2). Ph is a component of PRC1. PRC2 tri-methylates histone H3 on lysine 27, placing the repressive H3K27me3 mark, the hallmark of PcG activity, on the chromatin. PRC1 and PRC2 are both bound to PREs in H3K27me3 domains of the genome. It is not known exactly how PRC1 and PRC2 are recruited to the DNA or whether the same mechanism is used in all cases. Several important conclusions can be drawn from our data on Combgap. First, Combgap binds directly to the sequence GTGT, a motif previously identified as important for PRE activity. Second, our ChIP-seq data showed that Cg overlaps Ph binding both within H3K27me3 domains and in other regions of the genome. These data support the emerging view that PRC1 and PRC2 can bind independently of each other. Third, our data strongly suggest that Cg can recruit Ph to the DNA in the absence of other PRC1 components. This suggests that Ph can act by itself or in an unidentified protein complex. Finally, through analysis of Ph binding in cg mutants, we show that only a subset of Ph binding is lost in the absence of the Cg protein. This shows that not all PREs are identical and illustrates the redundancy of PcG recruitment to individual PREs.

Formation of a Polycomb domain in the absence of strong Polycomb response elements (2).

PREs have been identified as isolated elements in transgenic assays and as strong peaks of PcG protein binding in ChIP-seq experiments. The goal of our experiments was to test the function of PREs within a PcG target gene in its endogenous context. The invected (inv) and engrailed (en) genes encode important developmental transcription factors, are located next to each other in the genome, share regulatory DNA, are PcG-regulated, and lie within a 113 kb H3K27me3 domain.
There are two well-characterized PREs located upstream of the en transcription start site, and two located in the inv gene. ChIP-seq experiments show strong PcG-protein peaks at these well-characterized PREs. In addition to these strong peaks, weaker but still statistically significant PcG ChIP-seq peaks are also present in this domain. Some of these weak peaks are stage-specific, while the strong peaks are present in all cell types and at all developmental stages. The goal of our study was to test whether the weak peaks were sufficient to drive the formation of an H3K27me3 domain in the absence of the strong peaks. Surprisingly, in vivo deletion of the four characterized strong PREs from the PcG-regulated invected-engrailed (inv-en) gene complex did not disrupt the formation of the H3K27me3 domain and did not affect inv-en expression in embryos or larvae, suggesting the presence of redundant PcG recruitment mechanisms. Further, the 3D structure of the inv-en domain was only minimally altered by the deletion of the major PREs. Our data also showed that a transgene containing three weak peaks was able to establish a small H3K27me3 domain and was PcG-regulated. We conclude that there are many PREs within the inv-en domain and that these PREs are, at least to some extent, functionally redundant.

These data have important implications for understanding PcG recruitment. PREs are easily recognized in Drosophila as large PcG-binding peaks in ChIP-seq data, and there are good assays for PRE activity. In contrast, in mammals, large PcG-protein peaks do not exist in ChIP-seq data and PREs have been hard to identify. Functional studies show that there are many PREs located in mammalian genes. Our data suggest that Drosophila genes are similar to mammalian genes in that they have many weak PREs located throughout the H3K27me3 domain. We suggest that these weak PREs must be present within a locus to spread the H3K27me3 mark throughout a 113 kb domain. Experiments are ongoing in the lab to test this hypothesis.