We propose to investigate improved algorithms for several challenges associated with chromosome conformation capture (3C) assays and related experiments. These recent high-throughput experiments give pairwise contact information between chromatin regions and have provided glimpses of the spatial organization of the genomes of several organisms. They have been used to computationally infer three-dimensional models of chromatin structure and to hypothesize functional spatial relationships among genomic features such as co-expressed genes, regulatory regions and their regulated genes, common breakpoint locations, and others. However, computational tools for structural modeling, for relating function to structure, and for visualizing 3C data are still lacking. This proposal seeks to develop computational tools for several central 3C analysis tasks.
In Aim 1, we propose coupling sampling with an optimization framework to model populations of chromatin structures that are consistent with 3C data. This is essential because 3C provides an average over different structures in millions of cells.
In Aim 2, we devise techniques to find shared and distinct structural features within these ensembles and to compare structures across cell types (e.g., cancer vs. normal; lymphoblastoid vs. fibroblast), along with better techniques to identify genomic loci whose spatial co-location is statistically significant.
In Aim 3, we propose to develop a "spatial genome browser" that integrates 1-D genomic annotations (genes, methylation, DNase accessibility, etc.) with 3C spatial data.
We will apply these techniques to quantify the amount of cell-to-cell and cell-type variation in human, yeast, and mouse. Using improved populations of models, we will identify new instances of long-range regulation and explain existing postulated distal enhancer-promoter interactions. We will also correlate structure with eQTLs and GWAS-identified SNPs to explain the mechanism causing the eQTL and the effect of the SNP. Finally, we will search for relationships between co-expressed genes and spatial proximity. The techniques we propose will result in better structural models computed more efficiently and a better understanding of the relationships between structure and function.
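
The motivation for Aim 1, that a 3C measurement averages over many different structures, can be illustrated with a minimal sketch. This is not the proposed method: the two toy conformations, the contact cutoff, the equal weights, and the function names are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def contact_map(coords, cutoff):
    """Return the set of bead pairs (i, j), i < j, closer than `cutoff`."""
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if dist(coords[i], coords[j]) < cutoff
    }


def ensemble_frequencies(structures, weights, cutoff):
    """Weighted fraction of the ensemble in which each bead pair is in contact."""
    total = sum(weights)
    freq = {}
    for coords, weight in zip(structures, weights):
        for pair in contact_map(coords, cutoff):
            freq[pair] = freq.get(pair, 0.0) + weight / total
    return freq


# Hypothetical toy data: two 4-bead conformations of the same locus.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]  # beads 0 and 3 touch
zigzag = [(0, 0, 0), (1, 0, 0), (0.5, 0.866, 0), (1.5, 0.866, 0)]  # beads 0 and 2 touch

CUTOFF = 1.2  # assumed contact distance, arbitrary units
structures = [square, zigzag]

# Population-averaged contact frequencies, as a 3C-like readout would see them.
freq = ensemble_frequencies(structures, weights=[1, 1], cutoff=CUTOFF)

# Check whether the two long-range contacts ever appear in the same structure.
co_occur = any({(0, 2), (0, 3)} <= contact_map(s, CUTOFF) for s in structures)

for pair in sorted(freq):
    print(pair, freq[pair])
print("0-2 and 0-3 co-occur in a single structure:", co_occur)
```

In this toy ensemble the averaged map reports both long-range contacts at intermediate frequency even though they never occur together in one conformation, which is why fitting a single consensus structure to averaged data can mislead and a population of structures is modeled instead.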

Public Health Relevance

The large-scale spatial arrangement of chromosomes affects DNA replication and evolution and has been implicated in the development of several types of cancer. Detailed estimates of chromatin structure are now possible with chromatin conformation capture experiments coupled with computational analysis of their results. The new algorithms and software proposed here will advance that computational analysis, with a particular emphasis on controlling population effects present in recent chromatin conformation capture experiments, leading to more certain associations between chromatin structure and disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Good, Peter J
Carnegie-Mellon University
Schools of Arts and Sciences
United States