Chromosome Conformation Capture (3C)-based technologies for detecting long-range DNA interactions have matured recently. Hi-C and its variants are currently the state of the art for genome-wide mapping of long-range interactions. Although there is a fast-growing literature on Hi-C data analysis methods, repeat sequences, which can make up a significant proportion of Hi-C reads, are excluded from these analyses. Discarding repetitive genomic content can result in a regulatory element in a mapped region being erroneously assigned to a target that is not its bona fide target. Moreover, regulatory elements residing in repetitive regions can control targets in mapped regions. To address these challenges, we will develop biologically motivated, statistically rigorous approaches for allocating multi-mapping reads in Hi-C data analysis.
The aims will be accomplished through a combination of methodological development, data-driven simulation, computational analysis, and experimental validation. Statistical resources generated from this project will be disseminated as open-source software. Collectively, these aims will significantly enhance the utility of Hi-C data for profiling long-range interactions of repetitive DNA.

Public Health Relevance

Genome-wide data from chromosome conformation capture experiments and their variants provide overwhelming evidence that the three-dimensional organization of chromatin impacts gene regulation and genome function. One critical shortcoming of existing approaches for analyzing these data is that they discard reads that align to multiple locations on the genome. This project seeks to enhance our knowledge of long-range regulatory interactions involving repetitive genomic regions by incorporating such reads into the analysis.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21HG009744-02
Application #
9593026
Study Section
Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer
Gilchrist, Daniel A
Project Start
2017-11-01
Project End
2020-10-31
Budget Start
2018-11-01
Budget End
2020-10-31
Support Year
2
Fiscal Year
2019
Name
University of Wisconsin Madison
Department
Biostatistics & Other Math Sci
Type
Schools of Medicine
DUNS #
161202122
City
Madison
State
WI
Country
United States
Zip Code
53715