Since its inception in 2003, the Encyclopedia of DNA Elements (ENCODE) Consortium has made remarkable progress towards the identification of all functional elements in the human genome. However, major limitations of the current catalog are that the vast majority of elements have not been functionally characterized, the impact of genetic variation on their function is poorly defined, the precise levels of activation or repression that they confer remain unmeasured, and the specific gene(s) that they regulate are not definitively known. To address these gaps, we will implement 'in genome' massively parallel functional assays to characterize over 100,000 ENCODE-based candidate regulatory elements, to confirm and quantify their activities, and to link many of them to their target genes. In a systematic comparison of episomal vs. genomic massively parallel reporter assays (MPRA), we show that episomal assays fail to accurately capture the full patterns of regulatory activity that are observed in the context of chromatin. We therefore focus exclusively on methods that test candidate regulatory elements in an integrated, 'in genome' context. First, using lentivirus-based massively parallel reporter assays, we will characterize at least 100,000 ENCODE-based regulatory elements for their promoter/enhancer activity while integrated into the genome (lentiMPRA; Aim 1a). Importantly, lentiMPRA can be carried out in almost every cell type and leverages ongoing developments in lentivirus technology. Early results will be used to iteratively develop models that make better selections for subsequent rounds of functional characterization. Second, we will use CRISPR/Cas9 and multiplex homology-directed repair to integrate a subset of candidate enhancers into the 3' UTR of transcriptionally inactive genes, allowing us to further validate and characterize their ability to activate transcription in a natural genomic context ('in genome' STARR-seq; Aim 1b). Finally, we will implement a new paradigm involving CRISPR/Cas9-based multiplex genome editing followed by RNA-seq/ATAC-seq molecular profiling to characterize the functional consequences of mutations in a genome-wide subset of candidate regulatory elements in their native genomic context, while also determining the target gene(s) that they regulate (massively parallel genome editing; Aim 2). Although we will initially focus our efforts on K562 and HepG2 cells, we will also perform work in other cell lines as appropriate for the needs of the ENCODE Consortium, with 25% of our capacity dedicated to a common set of elements. Combined with the efforts of the other functional characterization centers, our work will provide unprecedented 'in genome' validation and characterization of ENCODE-defined candidate regulatory elements, while also yielding insights into the basic biology of gene regulation and how regulatory variants contribute to human disease risk.
The Encyclopedia of DNA Elements (ENCODE) project has made remarkable progress towards identifying all regulatory elements in the human genome, but almost none of these predictions have been functionally tested. We plan to use massively parallel reporter assays to functionally characterize at least 100,000 ENCODE-annotated candidate regulatory elements. In addition, we will use CRISPR/Cas9-based multiplex genome editing to generate mutations in thousands of these elements in their endogenous context, followed by genomic assays that will provide further validation while also linking elements to the genes that they regulate.
Starita, Lea M; Ahituv, Nadav; Dunham, Maitreya J et al. (2017) Variant Interpretation: Functional Assays to the Rescue. Am J Hum Genet 101:315-325