Enhancers are short non-coding DNA elements which regulate the expression of genes. They are less conserved than promoters and genes, and their emergence is an important driver of the evolution of new phenotypic forms and functions. Mutations in non-coding regulatory sequences cause many human diseases, including inherited Mendelian diseases and cancers. Enhancers often cluster together to confer additive regulatory specificity to a distal gene; however, it is not known if enhancers communicate and cooperate with other enhancers over genomic distance. The defining difference between enhancers and promoters is that one of the two divergent transcription start sites of promoters produces a messenger RNA molecule, while enhancers only produce non-coding transcripts which are often short and unstable. Precisely defining the grammar of enhancer activity and cooperativity will facilitate understanding of the basic biology and disease pathology of humans. Transposable elements are mobile DNA elements which make up ~50% of the human genome, represent a potent source of inter- and intra-species genetic diversity, and can act as enhancers by regulating adjacent host genes despite their mutagenic potential. By characterizing a cohort of newly evolved, TE-derived enhancers with high sequence similarity, we will observe a broad continuum of enhancer activity dictated by the genomic context and biochemical properties of each enhancer. LTR5HS is a subclass of the most recently endogenized retroviral element in the human genome, HERV-K, and contains elements which are polymorphic among humans. LTR5HS elements are transcriptionally active in the human embryo and act as enhancers controlling the expression of 275 genes in a human embryonic carcinoma-derived cell line. We will perturb these elements, both simultaneously and individually, and precisely define the impact of these perturbations on nascent transcription, 3D genome architecture, and phenotype.
First, we will use the CARGO system to introduce tens of gRNAs into cells, which, in combination with the sequence similarity of LTR5HS elements, enables targeting of dCas9 fused to activating or repressing domains to most of the 697 LTR5HS elements in the human genome. On a temporal axis, we will monitor nascent transcription and long-range contacts of the elements post-perturbation, and infer the mode of activity and cooperation of subnetworks depending on the timing and magnitude of these measurements. Second, we will create a pooled deletion library of all LTR5HS elements and measure the impact of each on nascent transcription using single-cell technology to capture gRNA sequence along with nascent RNA from single cells, and select against gRNAs deleterious for growth, differentiation, and pluripotency phenotypes. This highly granular approach will provide a framework for relating the biochemical properties and genomic context of enhancers to transcriptomic and phenotypic properties.
Non-coding regulatory DNA sequences, including enhancers, are implicated in a myriad of human diseases including developmental disorders and cancers. Expanding our ability to translate the biochemical properties and genomic context of enhancers to their phenotypic and transcriptomic functions will greatly improve our understanding of how an individual's genome sequence relates to their disease pathology. Novel mechanistic insights regarding enhancer function derived from this work will facilitate improvements in the precision and capability of genomic medicine.