The structures of cancer genomes present formidable puzzles for genomicists. The data pieces available for assembling a complete view are either very large (cytogenetic) or very small (sequence data), often differing in scale by more than a thousand-fold. Add genomic dispersity within a tumor and breakpoints within interspersed repeats, and the puzzle grows much more difficult. The aims of this application are therefore to scale data piece size seamlessly through a hierarchical framework: new algorithms and computational pipelines will combine long-range physical maps, constructed through significant advancements to Nanocoding, with sequence data to create scalable views of cancer genomes that span from nucleotide to chromosome. This multipronged project will involve synergistic advancements to DNA labeling, presentation of very large genomic DNA molecules, scanners for single-molecule analytes, and machine vision. All of these system components will be informed by advanced bioinformatic analysis techniques developed for single-molecule analysis, and by cutting-edge computer simulations of DNA conformations within the devices that will be the foundry for large datasets. This highly integrated system will be aimed at discovering novel structural variants within four paired multiple myeloma/normal samples, tabulating previously undetectable events as candidates for validation and further study. The resulting platform, which melds new single-molecule technologies with advanced bioinformatics techniques, portends scalable, comprehensive, and fast analysis for navigating cancer genomes.
Cancer genomes are very difficult to view in fine detail as a whole. We propose to build an integrated system that will analyze very large chunks of genetic material, allowing entire cancer genomes to be viewed like Google Earth, with the ability to zoom seamlessly from a planet view to a street corner.
Rajaraman, Ashok; Ma, Jian (2018) Toward Recovering Allele-specific Cancer Genome Graphs. J Comput Biol 25:624-636
Kounovsky-Shafer, Kristy L; Hernandez-Ortiz, Juan P; Potamousis, Konstantinos et al. (2017) Electrostatic confinement and manipulation of DNA molecules for genome analysis. Proc Natl Acad Sci U S A 114:13400-13405
Hou, Jack P; Emad, Amin; Puleo, Gregory J et al. (2016) A new correlation clustering method for cancer mutation analysis. Bioinformatics 32:3717-3728
Tian, Dechao; Gu, Quanquan; Ma, Jian (2016) Identifying gene regulatory network rewiring using latent differential graphical models. Nucleic Acids Res 44:e140
He, Feifei; Li, Yang; Tang, Yu-Hang et al. (2016) Identifying micro-inversions using high-throughput sequencing reads. BMC Genomics 17 Suppl 1:4
Li, Yang; Zhou, Shiguo; Schwartz, David C et al. (2016) Allele-Specific Quantification of Structural Variations in Cancer Genomes. Cell Syst 3:21-34
Lee, Seonghyun; Oh, Yeeun; Lee, Jungyoon et al. (2016) DNA binding fluorescent proteins for the direct visualization of large DNA molecules. Nucleic Acids Res 44:e6
Gupta, Aditya; Place, Michael; Goldstein, Steven et al. (2015) Single-molecule analysis reveals widespread structural variation in multiple myeloma. Proc Natl Acad Sci U S A 112:7689-94