Mutations in gene regulatory elements are a major cause of human disease. Large-scale genomic assays, such as ChIP-seq and ATAC-seq, have identified millions of putative regulatory elements across many different cell types and tissues. Furthermore, massively parallel reporter assays (MPRAs) have allowed us to test thousands of regulatory sequences and their variants for functional activity in a high-throughput manner. In addition, lentivirus-based MPRAs (lentiMPRAs) have enabled candidate sequences to be tested for regulatory activity with high reproducibility, both in hard-to-transfect cells and in a chromatin context via genomic integration. While these assays have significantly expanded our knowledge of regulatory elements, no existing technology can simultaneously analyze the regulatory function of a specific sequence and the transcription factors, cofactors and epigenomic modifications that determine it. Here, we will develop a novel technology, crMPRA (CUT&RUN MPRA), that combines two separate techniques, lentiMPRA and cleavage under targets and release using nuclease (CUT&RUN), to simultaneously analyze, in a high-throughput manner, the regulatory activity, protein binding and epigenetic modification of thousands of sequences. We will take advantage of lentiMPRA both to test thousands of candidate sequences for their regulatory activity and to enrich the genome with thousands of integrations of each specific sequence, so that it can be assayed for protein binding and epigenetic modifications via CUT&RUN.
In Aim 1, we will develop crMPRA by taking advantage of sequences that were previously characterized via lentiMPRA (regulatory activity) and ChIP-seq (protein binding and epigenetic marks) in hepatocellular carcinoma HepG2 cells. We will use these sequences to build an MPRA library and characterize their regulatory activity via lentiMPRA. In the same cells, we will simultaneously carry out CUT&RUN for specific transcription factors (TFs) and cofactors (e.g., EP300, FOXA2, HNF4A) and for epigenetic marks (e.g., H3K27ac, H3K27me3).
In Aim 2, we will use crMPRA to test a transcription factor binding site perturbation library, determining how these perturbations affect regulatory activity, protein binding and epigenetic modification, and analyzing the functional correlation or independence among these states. As such, this novel technology will increase our understanding of the regulatory code at several levels, including TF binding, histone modification and transcriptional activation, and of how its alteration can lead to human disease.
Mutations in gene regulatory elements, sequences that instruct genes when, where and at what levels to turn on or off, are a major cause of human disease. However, we currently do not have a good understanding of how these elements function at the molecular level, including protein binding, epigenetic modification and transcriptional activation. In this proposal, we plan to develop a novel high-throughput assay that can measure these states simultaneously, thus providing a better understanding of the regulatory code.