Most of the thousands of sequencing experiments generated by ENCODE provide 1D readouts of the epigenetic landscape or transcriptional output of a 3D genome. New chromosome conformation capture (3C) technologies, in particular Hi-C and ChIA-PET, have begun to provide insight into the hierarchical 3D organization of the genome: the partition of chromosomes into open and closed compartments; the existence of structural subunits defined as topologically associating domains (TADs); and the presence of regulatory and structural DNA loops within TADs. New experimental evidence using CRISPR/Cas-mediated genome editing suggests that disruption of local 3D structure can alter regulation of neighboring genes, and there have been early efforts to use data on 3D DNA looping to predict the impact of non-coding SNPs identified by GWAS. The goal of this proposal is to develop new integrative computational methods to interpret large-scale ENCODE 1D epigenomic and transcriptomic resources in light of the underlying 3D architecture of the genome. Members of our team have pioneered powerful methods to infer local chromatin states from a 1D viewpoint through the Segway suite. We have also analyzed the 1D organization of chromatin-accessible elements and their lineage dynamics to define the concept of regulatory complexity, and we presented a gene regulation model that predicts gene expression changes during differentiation from the DNA content of active enhancers. Here we will build on these efforts to learn chromatin state and gene regulation models that incorporate information on hierarchical 3D genomic architecture, enabling us to predict how individual structural and regulatory elements contribute to 3D DNA looping and to gene expression. Mechanistic predictions will be experimentally validated in their native cell-type-specific chromatin context using state-of-the-art genome editing, exploiting computational and experimental CRISPR/Cas tools developed by our team.
This project develops advanced computational methods for integrating information on the 3D structure of the human genome with large-scale genomics data sets generated by the ENCODE project, in order to gain insight into cell-type-specific chromatin state and gene regulation. These studies have broad relevance for understanding the regulation of gene expression in human cells and the disruption of gene expression programs in disease.
Chan, Rachel C W; Libbrecht, Maxwell W; Roberts, Eric G et al. (2018) Segway 2.0: Gaussian mixture models and minibatch training. Bioinformatics 34:669-671
Carty, Mark; Zamparo, Lee; Sahin, Merve et al. (2017) An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data. Nat Commun 8:15454
Perez, Alexendar R; Pritykin, Yuri; Vidigal, Joana A et al. (2017) GuideScan software for improved single and paired CRISPR guide RNA design. Nat Biotechnol 35:347-349