As single-cell genomic technologies have matured, it has become apparent that cell-to-cell heterogeneity is a ubiquitous feature of multicellular systems, with far-reaching consequences for health and disease. Cell-to-cell genomic heterogeneity has become a research focus in numerous biomedical contexts, including tissue development, regeneration, neural activity, infectious disease dynamics, and immune response. It has been best studied in cancers, where it is now understood to be both common and crucial to tumor growth, progression, metastasis, and therapeutic response. For all the promise of single-cell genomics, however, substantial hurdles remain between current basic research and translational applications. In particular, single-cell genomic profiling is still orders of magnitude too costly to perform at the scales needed to profile cellular heterogeneity in more than small groups of study subjects, making it unusable for identifying statistically robust features of heterogeneity across patient cohorts, much less for routine clinical use. A more cost-effective alternative to true single-cell genomics is the computational strategy of genomic deconvolution, which uses bulk genomic measurements from tissue samples to infer the genomic signals of major clones shared among those samples. Such methods have seen a recent flurry of interest, both in the computational biology developer community and in the analysis pipelines of major cancer research sequencing efforts, as a cost-effective way of profiling intratumor heterogeneity. They have substantial disadvantages relative to true single-cell data, though: they generally provide only very imprecise reconstructions of a few high-abundance clonal populations. They are especially ill-suited to deconvolving genomic signals in the presence of copy number variations (CNVs), the major mechanism of progression in most solid tumors.
The proposed work will seek to address the need for accurate but cost-effective methods for studying cellular heterogeneity by combining the advantages of true single-cell genomics with those of computational deconvolution. It will develop methods that use limited amounts of single-cell data to enhance the accuracy and resolution of computational deconvolution of bulk data, at a fraction of the cost of true single-cell profiling. It will pursue this direction, with a specific focus on the cancer context, in two major variants: one combining bulk and single-cell sequencing (scSeq) data, and the other combining bulk sequencing with fluorescence in situ hybridization (FISH), a technology that can characterize clonal populations at limited numbers of CNV markers per cell but in far greater numbers of cells than is practical for scSeq. It will validate the resulting methods on tumor samples for which bulk, single-cell, and FISH profiles are all available. The resulting methods will provide a way to make single-cell genomics practical today on the scales needed for robust statistical analysis of subject cohorts and for potential future clinical applications.
Cell-to-cell genomic heterogeneity is a ubiquitous feature of complex tissues, with broad significance across biology and medicine. Methods to probe genomic heterogeneity at the cellular level are practical only on small scales, however, limiting the possibilities for bringing single-cell genomics into translational applications. The proposed work will develop a computational strategy to meet the need for greatly expanding the practical scale of single-cell genomic methods, with a particular focus on cancer biology.
Eaton, Jesse; Wang, Jingyi; Schwartz, Russell (2018) Deconvolution and phylogeny inference of structural variations in tumor genomic samples. Bioinformatics 34:i357-i365
Oltmann, Johanna; Heselmeyer-Haddad, Kerstin; Hernandez, Leanora S et al. (2018) Aneuploidy, TP53 mutation, and amplification of MYC correlate with increased intratumor heterogeneity and poor prognosis of breast cancer patients. Genes Chromosomes Cancer 57:165-175
Thomas, Marcus; Schwartz, Russell (2018) A method for efficient Bayesian optimization of self-assembly systems from scattering data. BMC Syst Biol 12:65
Kang, John; Rancati, Tiziana; Lee, Sangkyu et al. (2018) Machine Learning and Radiogenomics: Lessons Learned and Future Directions. Front Oncol 8:228
Roman, Theodore; Xie, Lu; Schwartz, Russell (2017) Automated deconvolution of structured mixtures from heterogeneous tumor genomic data. PLoS Comput Biol 13:e1005815