We propose to develop a flexible and highly multiplexed reporter assay that will enable high-throughput, quantitative screening and dissection of the millions of non-coding functional elements uncovered by ENCODE. The assay relies on generating and co-delivering pools of reporter constructs in which each DNA sequence of interest is linked to a synthetic reporter gene that carries a distinguishing tag in its 3' UTR. The relative activities of the DNA sequences are inferred by counting their respective tags using reporter-specific mRNA sequencing (Tag-Seq). The pools of reporter constructs can be generated by either microarray-based DNA synthesis or multiplexed cloning and DNA sequencing, without the need for subcloning individual constructs. In proof-of-principle experiments based on transient transfections, we have already shown that the Tag-Seq assay can generate highly quantitative data that are directly comparable to those from traditional bioluminescence-based reporter assays. Here, we propose to develop and test multiple strategies for integrating Tag-Seq reporter constructs into the genomes of ENCODE cell types, as well as to develop new constructs that will enable analysis of a wide range of functional elements and regulatory activities, including (1) distal enhancers, (2) insulators and enhancer-blockers, (3) splicing regulators, and (4) elements controlling RNA translation, stability, and localization.
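As an illustration of the tag-counting step described above, the following is a minimal sketch, not the project's analysis pipeline. It assumes that tags have already been extracted from the sequencing reads, that a design table maps each tag to its DNA sequence of interest, and that mRNA tag counts are normalized to tag counts from the construct pool itself; that normalization, the pseudocount, and all function and variable names are hypothetical choices made for this example.

```python
from collections import Counter
from math import log2


def count_tags(tags, tag_to_element):
    """Count reads per element. Tags absent from the design table are ignored."""
    counts = Counter()
    for tag in tags:
        element = tag_to_element.get(tag)
        if element is not None:
            counts[element] += 1
    return counts


def relative_activity(mrna_counts, dna_counts, pseudocount=1.0):
    """Return the log2 ratio of mRNA tag fraction to DNA tag fraction per element.

    Assumption: normalizing to the construct pool (DNA) corrects for unequal
    representation of constructs. The pseudocount avoids division by zero.
    """
    elements = sorted(dna_counts)
    mrna_total = sum(mrna_counts.get(e, 0) + pseudocount for e in elements)
    dna_total = sum(dna_counts[e] + pseudocount for e in elements)
    activity = {}
    for e in elements:
        mrna_fraction = (mrna_counts.get(e, 0) + pseudocount) / mrna_total
        dna_fraction = (dna_counts[e] + pseudocount) / dna_total
        activity[e] = log2(mrna_fraction / dna_fraction)
    return activity


if __name__ == "__main__":
    # Hypothetical two-element design and made-up reads, for illustration only.
    design = {"ACGT": "element_A", "TTGA": "element_B"}
    mrna = count_tags(["ACGT", "ACGT", "ACGT", "TTGA", "GGGG"], design)
    dna = count_tags(["ACGT", "TTGA"], design)
    print(relative_activity(mrna, dna, pseudocount=0.0))
```

Because each element's activity is expressed relative to the rest of the pool, the values are comparable within a single pool but not across pools without additional controls.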
The goal of this project is to develop technology that will enable studies of how the rules that govern gene expression are encoded in human DNA. Insights into this question will help us understand the genetic basis of human development and disease.