We are poised to enter a new era of conformational biology. Genome conformation is critical for numerous cellular processes, including gene regulation, with certain alterations (translocations, fusions) being oncogenic. While recent assays, notably Hi-C, have already transformed understanding of chromatin architecture, even newer technologies have the potential to dramatically improve accuracy and resolution of three-dimensional (3D) genome reconstructions. However, to fully realize this potential, new statistical methods and algorithms will be required to operate on the resultant data and structures, and to integrate concomitant biomedical data. This project aims to develop such methods. A concrete example is provided by current findings identifying an instance of insulated neighborhood disruption as a novel oncogenic mechanism. Instead of an individual instance, we will develop methods to detect, and prioritize, genome-wide candidates, building on our previous work on 3D hotspot elicitation. In particular, we will devise original reconstruction-free approaches to avert uncertainties in inferring architecture. Despite these uncertainties, reconstructions confer several advantages. We will deploy newly devised assays, in conjunction with recent algorithmic advances, to improve reconstruction accuracy and resolution. Multiplexed FISH provides richer imaging of chromatin conformation, enabling refinement of transfer functions linking Hi-C contacts to distances, a precursor to reconstruction. Protein-centric HiChIP provides gains in informative reads, as does multi-read rescue. Combining these advances will produce enhanced approaches to 3D genome reconstruction. The very notion of "a" 3D genome reconstruction has been questioned since the underlying Hi-C assays are based on large cell populations.
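The transfer-function step mentioned above, which converts Hi-C contact counts to distances before reconstruction, can be illustrated with a minimal sketch. It assumes a power-law transfer function d = c^(-alpha), a common choice in the Hi-C literature but not necessarily the form this project refines, and it uses classical multidimensional scaling as a stand-in reconstruction method rather than the project's own algorithms. The function names and the default alpha are hypothetical.

```python
import numpy as np

def contacts_to_distances(contacts, alpha=1.0):
    """Assumed power-law transfer function: d = c ** (-alpha)."""
    c = np.asarray(contacts, dtype=float)
    d = np.full_like(c, np.nan)
    pos = c > 0
    d[pos] = c[pos] ** (-alpha)
    # Crude fill for pairs with no observed contacts: largest finite distance.
    d[~pos] = np.nanmax(d)
    np.fill_diagonal(d, 0.0)
    return d

def classical_mds(d, dim=3):
    """Classical (Torgerson) MDS: coordinates from a distance matrix."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]           # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(vals[top], 0.0, None))
    return vecs[:, top] * scale
```

Given a symmetric contact matrix `contacts`, `classical_mds(contacts_to_distances(contacts))` returns one 3D coordinate per genomic bin. The result is sensitive to `alpha`, which is the kind of dependence that imaging-based distance measurements could help calibrate.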
Multiplexed in situ Hi-C has enabled generation of thousands of single-cell datasets, which we will couple with a new multi-track reconstruction algorithm to dissect inter-cellular structural heterogeneity. We will also use these data to develop classifiers, based on structural differences, for discriminating between cell types. Much downstream interpretation of Hi-C data has derived from spectral analysis of the contact matrix, especially delineation of chromatin compartments. Spectral summarization has limitations, including compartment identification at high resolution, sensitivity to normalization, and extent of explained variation. We will evaluate spectral analysis of contact matrices with emphasis on the impact of approximations on 3D reconstructions, assessed via (i) inferred distance matrices, (ii) derived reconstructions, and (iii) subsequent hotspot detection.
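As an illustration of the spectral analysis referred to above, the following sketch follows the widely used recipe of normalizing contacts by their per-diagonal mean (observed/expected), forming the row correlation matrix, and taking the sign of its leading eigenvector as a compartment call. This is an assumed, generic procedure and not necessarily the one this project will evaluate. The function names are hypothetical, and the returned share of the eigenvalue total is only one simple way to express the extent of explained variation.

```python
import numpy as np

def observed_over_expected(contacts):
    """Divide each diagonal of a symmetric contact matrix by its mean."""
    c = np.asarray(contacts, dtype=float)
    n = c.shape[0]
    oe = np.zeros_like(c)
    for k in range(n):
        diag = np.diagonal(c, k)
        mean = diag.mean()
        if mean > 0:
            idx = np.arange(n - k)
            oe[idx, idx + k] = diag / mean
            oe[idx + k, idx] = diag / mean
    return oe

def compartment_eigenvector(contacts):
    """Leading eigenvector of the O/E correlation matrix and its eigenvalue share."""
    oe = observed_over_expected(contacts)
    # Rows with zero variance yield NaN correlations; zero them out (an assumption).
    corr = np.nan_to_num(np.corrcoef(oe))
    vals, vecs = np.linalg.eigh(corr)     # eigenvalues in ascending order
    lead = vecs[:, -1]                    # eigenvector of the largest eigenvalue
    explained = vals[-1] / vals.sum()     # share of the eigenvalue total
    return lead, explained
```

The sign of the eigenvector is arbitrary, so in practice compartment labels are oriented using an external track such as gene density. Both the normalization and the matrix resolution affect the result, which is the sensitivity the abstract notes.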

Public Health Relevance

The three-dimensional (3D) configuration of chromatin is critical for numerous cellular processes with, for example, attendant insulated neighborhood disruption being a newly proposed oncogenic driver. Several recent improvements to chromatin conformation capture (Hi-C) assays, along with advances in reconstruction algorithms, greatly enhance the potential for accurately inferring 3D chromatin organization at high resolution. By integrating these developments, including high-throughput single-cell Hi-C, with genome-indexed attributes, we will provide tools for downstream analyses that, among other capabilities, can prioritize candidate oncogenic disrupted neighborhoods, dissect inter-cellular structural heterogeneity, and provide a conformational viewpoint for parsing results of genome-wide studies.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM109457-07
Application #: 9773120
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Lyster, Peter
Project Start: 2013-09-01
Project End: 2021-08-31
Budget Start: 2019-09-01
Budget End: 2020-08-31
Support Year: 7
Fiscal Year: 2019
Total Cost:
Indirect Cost:
Name: University of California San Francisco
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 094878337
City: San Francisco
State: CA
Country: United States
Zip Code: 94118
Segal, Mark R; Bengtsson, Henrik L (2018) Improved accuracy assessment for 3D genome reconstructions. BMC Bioinformatics 19:196
Stolz, Robert; Yoshida, Masaaki; Brasher, Reuben et al. (2017) Pathways of DNA unlinking: A story of stepwise simplification. Sci Rep 7:12420
Lee, Cheng-Sheng; Wang, Ruoxi W; Chang, Hsiao-Han et al. (2016) Chromosome position determines the success of double-strand break repair. Proc Natl Acad Sci U S A 113:E146-54
Capurso, Daniel; Bengtsson, Henrik; Segal, Mark R (2016) Discovering hotspots in functional genomic data superposed on 3D chromatin configuration reconstructions. Nucleic Acids Res 44:2028-35
Arsuaga, Javier; Jayasinghe, Reyka G; Scharein, Robert G et al. (2015) Current theoretical models fail to predict the topological complexity of the human genome. Front Mol Biosci 2:48
Diao, Yuanan; Rodriguez, Victor; Klingbeil, Michele et al. (2015) Orientation of DNA Minicircles Balances Density and Topological Complexity in Kinetoplast DNA. PLoS One 10:e0130998
Segal, Mark R; Bengtsson, Henrik L (2015) Reconstruction of 3D genome architecture via a two-stage algorithm. BMC Bioinformatics 16:373
Segal, Mark R; Xiong, Hao; Capurso, Daniel et al. (2014) Reproducibility of 3D chromatin configuration reconstructions. Biostatistics 15:442-56
Capurso, Daniel; Segal, Mark R (2014) Distance-based assessment of the localization of functional annotations in 3D genome reconstructions. BMC Genomics 15:992