Tools for mapping transgene insertion sites and associated expression levels are broadly useful, but current approaches are experimentally cumbersome and have limited throughput. Here, we propose to develop a novel type of reporter construct, genome reporter interprocessing (GRiP), that makes it dramatically easier to randomly integrate reporter genes in a large number of genomic locations and subsequently map the insertion sites. We will apply GRiP technology to integrate a library of alternatively spliced reporter constructs into hundreds of thousands of different positions, thus enabling us to understand how alternative splice isoform ratios vary between different genomic contexts. We will use the resulting data to improve existing computational tools for predicting the impact of variants on alternative splicing. GRiP builds on existing technologies and pathways, most importantly CRISPR/Cas9 genome editing and transcript cleavage and polyadenylation (CPA). We will use a guide RNA (gRNA) library and Cas9 endonuclease activity to create double-strand breaks in a large but defined set of genomic locations. A linear reporter cassette will be co-transfected with the gRNA library and, at some frequency, will be ligated into the break through the non-homologous end joining pathway. The reporter cassette consists of at least a promoter, a coding sequence, and a truncated 3′ UTR that ends directly at the core signal recruiting the CPA machinery. To perform insertion mapping, we take advantage of a key feature of CPA, namely the ~17 nucleotide (nt) distance between the core signal and the position of transcript cleavage and polyadenylation. Because the core signal of the reporter cassette will be directly ligated to genomic DNA, transcription will run through the end of the cassette and cleavage will occur ~17 nt into the neighboring genomic sequence. As a result, each transcript's GRiP site will carry a ~17 nt "barcode" that reveals information about the site of integration. By sequencing the reporter transcripts and by combining the barcode information with information about the possible insertion sites (as determined by the chosen gRNAs), both the location of insertion and the transcriptional activity can be precisely mapped. Although we focus here on initial technology development and on applying GRiP to investigate alternative splicing in the genome, we expect that this technology will find a wide range of additional applications, including the precise mapping of double-strand breaks generated during genome editing.
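The barcode-to-site lookup described above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the function names, the 0-based cut coordinate, the fixed barcode length, and the exact-match rule are all assumptions made for illustration. Because a cassette joined by non-homologous end joining could in principle insert in either orientation, the sketch indexes the flanking sequence on both strands. Real data would also need to tolerate variable cleavage positions, indels at the junction, and sequencing errors, none of which are handled here.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
BARCODE_LEN = 17  # assumption: treat the ~17 nt distance as a fixed length


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def expected_barcodes(genome, cut_sites, k=BARCODE_LEN):
    """Index the barcodes expected at each candidate cut site.

    genome:    dict mapping chromosome name -> sequence (hypothetical input)
    cut_sites: dict mapping site id -> (chromosome, cut), where the break
               lies between seq[cut - 1] and seq[cut] (0-based, assumed)
    Returns a dict mapping barcode -> list of (site id, orientation).
    """
    index = defaultdict(list)
    for site_id, (chrom, cut) in cut_sites.items():
        seq = genome[chrom]
        if cut - k < 0 or cut + k > len(seq):
            continue  # not enough flanking sequence on one side
        # Cassette on the forward strand: transcript reads into seq[cut:].
        index[seq[cut:cut + k]].append((site_id, "+"))
        # Cassette on the reverse strand: transcript reads into the
        # reverse complement of the sequence upstream of the cut.
        index[reverse_complement(seq[cut - k:cut])].append((site_id, "-"))
    return index


def assign_reads(barcodes, index):
    """Count reads per (site id, orientation), skipping ambiguous barcodes."""
    counts = defaultdict(int)
    for barcode in barcodes:
        hits = index.get(barcode, [])
        if len(hits) == 1:  # keep only barcodes unique to a single site
            counts[hits[0]] += 1
    return dict(counts)


if __name__ == "__main__":
    # Toy example with a short barcode length so the sequences stay readable.
    toy_genome = {"chr1": "ACGTTGCAAGGCTTAACCGGATCGATTACA"}
    toy_sites = {"g1": ("chr1", 10)}
    toy_index = expected_barcodes(toy_genome, toy_sites, k=4)
    print(assign_reads(["GCTT", "GCTT", "CTTG", "AAAA"], toy_index))
```

In this sketch the read count per site serves as a rough proxy for transcriptional activity at that site, and barcodes shared by more than one candidate site are discarded rather than guessed.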
This research project aims to develop a novel class of highly compact gene expression reporters that can be integrated directly into the genome. These reporters will be used to understand the interplay between alternative splicing and chromatin. The resulting data will provide the basis for developing predictive models of alternative splicing and for understanding the consequences of genetic variation in humans.