We propose to investigate improved algorithms for several challenges associated with chromosome conformation capture (3C) assays and related experiments. These recent high-throughput experiments give pairwise contact information between chromatin regions and have provided glimpses of the spatial organization of the genomes of several organisms. They have been used to computationally infer three-dimensional models of chromatin structure and to hypothesize functional spatial relationships among genomic features such as co-expressed genes, regulatory regions and their regulated genes, common breakpoint locations, and others. However, computational tools for structural modeling, for relating function to structure, and for visualizing 3C data are still lacking. This proposal seeks to develop computational tools for several central 3C analysis tasks.
In Aim 1, we propose coupling sampling with an optimization framework to model populations of chromatin structures that are consistent with 3C data. This is essential because 3C provides an average over different structures in millions of cells.
In Aim 2, we devise techniques to find common and different structural features within these ensembles, comparing structures of different cell types (e.g. cancer vs. normal; lymphoblastoid vs. fibroblast), and better techniques to identify genomic loci that are statistically significantly spatially co-located. Finally, in Aim 3 we propose to develop a "spatial genome browser" that integrates 1-D genomic annotations (genes, methylation, DNase accessibility, etc.) with 3C spatial data. We will apply these techniques to quantifying the amount of cell-to-cell and cell-type variation in human, yeast, and mouse. Using improved populations of models, we will identify new instances of long-range regulation and explain existing postulated distal enhancer-promoter interactions. We will also correlate structure with eQTLs and GWAS-identified SNPs to explain the mechanism causing the eQTL and the effect of the SNP. Finally, we will search for relationships between co-expressed genes and spatial proximity. The techniques we propose will result in better structural models computed more efficiently and a better understanding of the relationships between structure and function.

Public Health Relevance

The large-scale spatial arrangement of chromosomes affects DNA replication and evolution and has been implicated in the development of several types of cancer. Detailed estimates of chromatin structure are now possible with chromatin conformation capture experiments coupled with computational analysis of their results. The new algorithms and software proposed here will advance that computational analysis, with a particular emphasis on controlling population effects present in recent chromatin conformation capture experiments, leading to more certain associations between chromatin structure and disease.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project (R01)
Project #: 5R01HG007104-02
Application #: 8739540
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Good, Peter J
Project Start: 2013-09-23
Project End: 2016-06-30
Budget Start: 2014-07-01
Budget End: 2015-06-30
Support Year: 2
Fiscal Year: 2014
Total Cost:
Indirect Cost:

Name: Carnegie-Mellon University
Department:
Type: Schools of Arts and Sciences
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Wang, Hao; Kingsford, Carl; McManus, C Joel (2018) Using the Ribodeblur pipeline to recover A-sites from yeast ribosome profiling data. Methods 137:67-70
Lee, Heewook; Kingsford, Carl (2018) Kourami: graph-guided assembly for novel human leukocyte antigen allele discovery. Genome Biol 19:16
Lee, Heewook; Kingsford, Carl (2018) Accurate Assembly and Typing of HLA using a Graph-Guided Assembler Kourami. Methods Mol Biol 1802:235-247
Sauerwald, Natalie; Kingsford, Carl (2018) Quantifying the similarity of topological domains across normal and cancer human cell types. Bioinformatics 34:i475-i483
Shao, Mingfu; Kingsford, Carl (2017) Accurate assembly of transcripts through phase-preserving graph decomposition. Nat Biotechnol 35:1167-1169
Shao, Mingfu; Ma, Jianzhu; Wang, Sheng (2017) DeepBound: accurate identification of transcript boundaries via deep convolutional neural fields. Bioinformatics 33:i267-i273
Patro, Rob; Duggal, Geet; Love, Michael I et al. (2017) Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 14:417-419
Marçais, Guillaume; Pellow, David; Bork, Daniel et al. (2017) Improving the performance of minimizers and winnowing schemes. Bioinformatics 33:i110-i117
Kram, Karin E; Geiger, Christopher; Ismail, Wazim Mohammed et al. (2017) Adaptation of Escherichia coli to Long-Term Serial Passage in Complex Medium: Evidence of Parallel Evolution. mSystems 2:
Sauerwald, Natalie; Zhang, She; Kingsford, Carl et al. (2017) Chromosomal dynamics predicted by an elastic network model explains genome-wide accessibility and long-range couplings. Nucleic Acids Res 45:3663-3673