We propose to investigate improved algorithms for several challenges associated with chromosome conformation capture (3C) assays and related experiments. These recent high-throughput experiments give pairwise contact information between chromatin regions and have provided glimpses of the spatial organization of the genomes of several organisms. They have been used to computationally infer three-dimensional models of chromatin structure and to hypothesize functional spatial relationships among genomic features such as co-expressed genes, regulatory regions and their regulated genes, common breakpoint locations, and others. However, computational tools for structural modeling, relating function to structure, and for visualizing 3C data are still lacking. This proposal seeks to develop computational tools for several central 3C analysis tasks.
In Aim 1, we propose coupling sampling with an optimization framework to model populations of chromatin structures that are consistent with 3C data. This is essential because 3C provides an average over different structures in millions of cells.
In Aim 2, we devise techniques to find common and different structural features within these ensembles, comparing structures of different cell types (e.g., cancer vs. normal; lymphoblastoid vs. fibroblast), and better techniques to identify genomic loci that are statistically significantly spatially co-located.
Finally, in Aim 3 we propose to develop a "spatial genome browser" that integrates 1-D genomic annotations (genes, methylation, DNase accessibility, etc.) with 3C spatial data.
We will apply these techniques to quantify the amount of cell-to-cell and cell-type variation in human, yeast, and mouse. Using improved populations of models, we will identify new instances of long-range regulation and explain existing postulated distal enhancer-promoter interactions. We will also correlate structure with eQTLs and GWAS-identified SNPs to explain the mechanism causing the eQTL and the effect of the SNP. Finally, we will search for relationships between co-expressed genes and spatial proximity. The techniques we propose will result in better structural models computed more efficiently and a better understanding of the relationships between structure and function.

Public Health Relevance

The large-scale spatial arrangement of chromosomes affects DNA replication and evolution and has been implicated in the development of several types of cancer. Detailed estimates of chromatin structure are now possible with chromatin conformation capture experiments coupled with computational analysis of their results. The new algorithms and software proposed here will advance that computational analysis, with a particular emphasis on controlling population effects present in recent chromatin conformation capture experiments, leading to more certain associations between chromatin structure and disease.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Good, Peter J
Carnegie-Mellon University
Schools of Arts and Sciences
United States
Shao, Mingfu; Kingsford, Carl (2017) Accurate assembly of transcripts through phase-preserving graph decomposition. Nat Biotechnol 35:1167-1169
Patro, Rob; Duggal, Geet; Love, Michael I et al. (2017) Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14:417-419
Kram, Karin E; Geiger, Christopher; Ismail, Wazim Mohammed et al. (2017) Adaptation of Escherichia coli to Long-Term Serial Passage in Complex Medium: Evidence of Parallel Evolution. mSystems 2:
Sauerwald, Natalie; Zhang, She; Kingsford, Carl et al. (2017) Chromosomal dynamics predicted by an elastic network model explains genome-wide accessibility and long-range couplings. Nucleic Acids Res 45:3663-3673
Marçais, Guillaume; Pellow, David; Bork, Daniel et al. (2017) Improving the performance of minimizers and winnowing schemes. Bioinformatics 33:i110-i117
Sefer, Emre; Kingsford, Carl (2016) Diffusion archeology for diffusion progression history reconstruction. Knowl Inf Syst 49:403-427
Wang, Hao; McManus, Joel; Kingsford, Carl (2016) Isoform-level ribosome occupancy estimation guided by transcript abundance with Ribomap. Bioinformatics 32:1880-1882
Spealman, Pieter; Wang, Hao; May, Gemma et al. (2016) Exploring Ribosome Positioning on Translating Transcripts with Ribosome Profiling. Methods Mol Biol 1358:71-97
Solomon, Brad; Kingsford, Carl (2016) Fast search of thousands of short-read sequencing experiments. Nat Biotechnol 34:300-302
Sefer, Emre; Duggal, Geet; Kingsford, Carl (2016) Deconvolution of Ensemble Chromatin Interaction Data Reveals the Latent Mixing Structures in Cell Subpopulations. J Comput Biol 23:425-438

Showing the most recent 10 out of 18 publications