Since its inception in 2003, the Encyclopedia of DNA Elements (ENCODE) Consortium has made remarkable progress towards the identification of all functional elements in the human genome. However, the current catalog has major limitations: the vast majority of elements have not been functionally characterized, the impact of genetic variation on their function is poorly defined, the precise levels of activation or repression that they confer remain unmeasured, and the specific gene(s) that they regulate are not definitively known. To address these gaps, we will implement 'in genome' massively parallel functional assays to characterize over 100,000 ENCODE-based candidate regulatory elements, both to confirm and quantify their activities and to link many of them to their target genes. In a systematic comparison of episomal vs. genomic massively parallel reporter assays (MPRAs), we show that episomal assays fail to accurately capture the full patterns of regulatory activity that are observed in the context of chromatin. We therefore focus exclusively on methods that test candidate regulatory elements in an integrated, 'in genome' context. First, using lentivirus-based MPRAs, we will characterize at least 100,000 ENCODE-based regulatory elements for their promoter/enhancer activity while integrated into the genome (lentiMPRA; Aim 1a). Importantly, lentiMPRA can be carried out in almost every cell type and leverages ongoing developments in lentivirus technology. Early results will be used to iteratively develop models that improve the selection of elements for subsequent rounds of functional characterization. Second, we will use CRISPR/Cas9 and multiplex homology-directed repair to integrate a subset of candidate enhancers into the 3' UTR of transcriptionally inactive genes, allowing us to further validate and characterize their ability to activate transcription in a natural genomic context ('in genome' STARR-seq; Aim 1b). Finally, we will implement a new paradigm in which CRISPR/Cas9-based multiplex genome editing is followed by RNA-seq/ATAC-seq molecular profiling, to characterize the functional consequences of mutations in a genome-wide subset of candidate regulatory elements in their native genomic context while also determining the target gene(s) that they regulate (massively parallel genome editing; Aim 2). Although we will initially focus our efforts on K562 and HepG2 cells, we will also perform work in other cell lines as appropriate for the needs of the ENCODE Consortium, with 25% of our capacity dedicated to a common set of elements. Combined with the efforts of the other functional characterization centers, our work will provide unprecedented 'in genome' validation and characterization of ENCODE-defined candidate regulatory elements, while also yielding insights into the basic biology of gene regulation and into how regulatory variants contribute to human disease risk.

Public Health Relevance

The Encyclopedia of DNA Elements (ENCODE) project has made remarkable progress towards identifying all regulatory elements in the human genome, but almost none of these predictions have been functionally tested. We plan to use massively parallel reporter assays to characterize the function of at least 100,000 ENCODE-annotated candidate regulatory elements. In addition, we will use CRISPR/Cas9-based multiplex genome editing to generate mutations in thousands of these elements in their endogenous context, followed by genomic assays that will provide further validation while also linking elements to the genes that they regulate.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project with Complex Structure Cooperative Agreement (UM1)
Project #: 1UM1HG009408-01
Application #: 9247479
Study Section: Special Emphasis Panel (ZHG1-HGR-L (O1))
Program Officer: Pazin, Michael J
Project Start: 2017-02-01
Project End: 2021-01-31
Budget Start: 2017-02-01
Budget End: 2018-01-31
Support Year: 1
Fiscal Year: 2017
Total Cost: $1,091,663
Indirect Cost: $198,282
Name: University of California San Francisco
Department: Pharmacology
Type: Schools of Pharmacy
DUNS #: 094878337
City: San Francisco
State: CA
Country: United States
Zip Code: 94118

Starita, Lea M; Ahituv, Nadav; Dunham, Maitreya J et al. (2017) Variant Interpretation: Functional Assays to the Rescue. Am J Hum Genet 101:315-325