In situ functional genomics to understand transcriptional regulation

In the last 3 years, new gene editing technologies have revolutionized our ability to manipulate the human genome for basic research and for disease modeling. Targeted gene knock-out and precision gene repair, previously laborious or impossible tasks in human cells, are now routine. Despite these advances in genome surgery, much of how the genome is translated into phenotype remains a mystery, and the most mysterious regions are those in the noncoding genome. Unlike with protein-coding genes, there is no Central Dogma-like framework to decipher how the noncoding genome functions. The goal of this proposal is to address a fundamental problem in transcriptional regulation: how can we identify the sequences and proteins that govern the expression of any gene, in an unbiased way? Consortium efforts like ENCODE and the Epigenomics Roadmap have produced large catalogs of biochemical hallmarks that correlate with noncoding function. However, correlation does not equal causation. Proving that certain regions of the genome regulate gene expression or act as landing pads for DNA-binding proteins requires unbiased mutagenesis and interrogation. In part, the problem is one of size. The noncoding genome is a vast expanse: noncoding regions constitute >98% of the 3 billion DNA bases in the human genome. We urgently need high-throughput molecular microscopes capable of zooming in on functional regions and recording how proteins interact at these loci. Given new advances in genome engineering and high-throughput sequencing, we are in a prime position to understand the functional, gene-regulatory architecture of the noncoding genome. Here, we will apply our established expertise to examine functional regions of the noncoding genome in their endogenous context.
Using human cancer and stem cell lines as model systems, we will develop five broadly applicable, cross-disciplinary platforms by harnessing recent advances in scalable DNA synthesis, genome engineering, droplet cell capture, deep sequencing and quantitative proteomics. These platforms will enable: 1) higher-resolution noncoding CRISPR screens using Cas9 orthologs; 2) increased-span (chromosome-scale) noncoding CRISPR screens using the new Cas enzyme Cpf1; 3) multidimensional readouts of entire gene networks by combining pooled mutagenesis with single-cell RNA-seq; 4) unbiased labeling of all transcription factors stationed at functional elements identified in CRISPR screens via a novel Cas9-enabled proteomic technology; and 5) joint application of these new technologies to reveal the dynamics of functional elements in neural differentiation and cancer drug resistance.

Public Health Relevance

The human genome contains many regions that do not produce proteins but instead regulate the expression of genes that produce proteins. Mutations in these regulatory regions can interfere with normal gene function and lead to different diseases, such as cancer or brain disorders. Here we present a suite of new technologies to pinpoint regulatory regions within the genome, to recognize the genes they regulate and to identify the mechanisms they use to alter gene expression.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
NIH Director’s New Innovator Awards (DP2)
Program Officer
Pazin, Michael J
New York Genome Center
New York
United States
Li, Li; Tian, E; Chen, Xianwei et al. (2018) GFAP Mutations in Astrocytes Impair Oligodendrocyte Progenitor Proliferation and Myelination in an hiPSC Model of Alexander Disease. Cell Stem Cell 23:239-251.e6
Sanjana, Neville E (2018) A genome-wide net to catch and understand cancer. Sci Transl Med 10:
Montalbano, Antonino; Canver, Matthew C; Sanjana, Neville E (2017) High-Throughput Approaches to Pinpoint Function within the Noncoding Genome. Mol Cell 68:44-59