The ENCODE project has produced high-resolution, high-quality maps of components of the "regulome" in a set of tissues and cell lines, identifying a collection of putative regulatory elements. Our proposal aims to test the functional relevance of these putative elements with high-throughput, pooled CRISPR screens. This platform will allow us to up-regulate, down-regulate, or mutate specific regulatory elements, and then probe the effects of these perturbations on cell survival under normal growth conditions and under a variety of stress conditions (oxidative stress, ricin toxicity, and nutrient deprivation) that produce differential sensitivities to gene expression. To establish our targets, we will use ENCODE data in concert with data generated by other consortia, using an integrative analysis pipeline that leverages both the correlation between element activity and gene expression and higher-order chromatin interactions to link functional elements with potential target genes. For ~3000 genes whose perturbation we have observed to affect proliferation, we will generate multiple libraries of ~100,000 guide RNAs for redundantly perturbing 20,000 enhancers linked to these genes through our analysis. We will identify "hits" in this screen by sequencing the guide RNA libraries before and after proliferation under our test conditions and looking for a reduction in specific guides. We will then test combinations of enhancer elements that may act in a cooperative or redundant fashion, exploring the functional linkages with a specific focus on superenhancers and their sub-elements.
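The hit-calling step described above (comparing guide RNA abundance before and after proliferation) can be illustrated with a minimal sketch. This is not the project's analysis pipeline: the function names, the pseudocount, the median aggregation across redundant guides, and the threshold values are all assumptions chosen for illustration.

```python
import math
import statistics
from collections import defaultdict


def guide_log2_fold_changes(before, after, pseudocount=1.0):
    """Return log2(after / before) per guide, normalized to library size.

    `before` and `after` map guide IDs to raw sequencing read counts.
    The pseudocount (an assumed value) avoids division by zero for
    guides that drop out entirely.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    lfc = {}
    for guide, count_before in before.items():
        count_after = after.get(guide, 0)
        # Scale to reads per million so the two libraries are comparable.
        rpm_before = (count_before + pseudocount) / total_before * 1e6
        rpm_after = (count_after + pseudocount) / total_after * 1e6
        lfc[guide] = math.log2(rpm_after / rpm_before)
    return lfc


def depleted_elements(lfc, guide_to_element, threshold=-1.0, min_guides=2):
    """Return elements whose median guide log2 fold change is <= threshold.

    Requiring several guides per element (hypothetical default: 2) uses
    the redundancy of the library to reduce single-guide artifacts.
    """
    per_element = defaultdict(list)
    for guide, value in lfc.items():
        per_element[guide_to_element[guide]].append(value)

    hits = {}
    for element, values in per_element.items():
        if len(values) >= min_guides:
            median_lfc = statistics.median(values)
            if median_lfc <= threshold:
                hits[element] = median_lfc
    return hits
```

Taking the median over the redundant guides for each element is one simple way to make a call robust to a single ineffective or off-target guide; a real screen analysis would typically also model replicate variance and compare against non-targeting control guides.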
With these phenotypic validation data in hand, we will carry out molecular mechanistic validation by choosing 100 individual elements and 50 combinations of elements, generating stable cell lines with engineered genetic ablation of these elements, and carrying out genome-wide molecular characterization of accessible chromatin (ATAC-seq), chromatin looping (HiChIA, a novel, high-efficiency chromatin looping assay), and gene expression (RNA-seq). We will also generate and assess 50 lines ablating entire superenhancers and individual superenhancer elements, both individually and in combination. For a subset of these lines, we will carry out single-cell ATAC-seq to unravel effects on the variation of open chromatin within the population of cells. We will also compare results from pooled CRISPR expression reporter assays to our method of generating edits in the native genomic context. Analyzing these data with integrative analysis methods, scaffolded from ENCODE data, will generate global maps of the molecular consequences of deletions at the level of chromatin and gene expression changes. These data will be rapidly released to the community, and all techniques and cell lines will be made available to the ENCODE consortium and to the genomics community at large. This project will deliver a large corpus of functional data linking regulatory elements to genes, as well as extensive molecular characterization of a subset of these regulatory elements, providing a scaffold for understanding the classes and logic of functional elements, validating computational predictions, and providing techniques that are broadly extensible to other cell types and tissues.

Public Health Relevance

A number of large international consortia, including the ENCODE project, have produced prodigious data regarding the molecular features of different elements of the genome. Some of these elements have features associated with function, but mere association does not prove that these elements have functional effects on cell phenotype. This project will test these putative functional elements with high-throughput gene editing technologies coupled with high-throughput, genome-wide characterization of chromatin structure and gene expression, validating the regulatory potential of the identified genomic elements.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project with Complex Structure Cooperative Agreement (UM1)
Project #
1UM1HG009436-01
Application #
9247643
Study Section
Special Emphasis Panel (ZHG1-HGR-L (O1))
Program Officer
Pazin, Michael J
Project Start
2017-02-01
Project End
2021-01-31
Budget Start
2017-02-01
Budget End
2018-01-31
Support Year
1
Fiscal Year
2017
Total Cost
$944,486
Indirect Cost
$265,707
Name
Stanford University
Department
Genetics
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94304
Corces, M Ryan; Trevino, Alexandro E; Hamilton, Emily G et al. (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14:959-962
Schep, Alicia N; Wu, Beijing; Buenrostro, Jason D et al. (2017) chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat Methods 14:975-978
Boyle, Evan A; Andreasson, Johan O L; Chircus, Lauren M et al. (2017) High-throughput biochemical profiling reveals sequence determinants of dCas9 off-target binding and unbinding. Proc Natl Acad Sci U S A 114:5461-5466