Heterogeneous Cancer Progression from Microarray Data

The class of diseases collectively known as cancer could in principle be produced by a limitless number of combinations of mutations. Nonetheless, it has become apparent that most cancers can be grouped into a few common "sub-types," each characterized by a common way in which the controls on cell growth become disabled. By identifying these common sub-types and the particular sequences of genetic abnormalities that produce them, we can identify patient sub-populations who may respond to different treatments than the general population, find genes that may be useful targets for new anti-cancer drugs, and develop diagnostic tests to better predict patient outcomes and suggest which drugs will benefit which patients. Great progress has been made by examining gene expression within tumors, as different cancer sub-types have characteristic patterns of overly active or overly inactive genes. Interpreting these expression data is, however, a difficult problem for which sophisticated computer models have proven invaluable. One class of computer models, phylogenetic (evolutionary tree) models, has provided a powerful method for interpreting likely pathways by which different cell types evolve within tumors. There are two important variants of this phylogenetic approach: one using data gathered from gene expression microarrays, which assay thousands of genes averaged over large tumor samples, and another using data gathered from cytometric studies, which assay small numbers of genes in individual cells isolated from tumors. Each has advantages: the former allows a far more complete picture of overall gene activity, and the latter provides valuable clues about tumor evolution by identifying which cell types co-occur in individual tumors. The proposed work will develop new computer models that combine the advantages of both methods in a single approach.
The work will first develop approaches to infer the existence of common cell types from bulk microarray measurements of tumors sampled across patient populations. It will then build on prior methods to infer evolutionary similarity between these tumor states. Finally, it will adapt methods for cytometric tumor phylogenetics to the problem of inferring evolutionary sequences from these microarray states. The result will be a unified approach for inferring evolution among individual cell states, as in a cytometric study, but assayed on thousands of genes, as in a microarray study. The unified approach will be validated on breast cancer data, for which both microarray and cytometric measurements are available, and applied to the discovery of common progression pathways in breast cancer populations. The study can be expected to uncover distinct stages in breast cancer progression that would not be apparent with existing methods, aiding in the identification of new patient sub-populations, drug targets, and diagnostic tests. The methods to be developed are likely to have broader applicability to solid tumor progression in general and to related problems of analyzing cell differentiation in mixed samples.

Public Health Relevance

The proposed work will develop computer models to identify the sequences of mutations that cause once-healthy cells to become cancerous and then to become increasingly aggressive over time. These models will help to identify targets for new anti-cancer drugs and to determine which patients will benefit from which drugs. The methods to be developed will be applied directly to breast cancer but should be applicable to many other forms of solid tumor.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Li, Jerry
Carnegie-Mellon University
Schools of Arts and Sciences
United States
Xie, Lu; Smith, Gregory R; Schwartz, Russell (2017) Derivative-Free Optimization of Rate Parameters of Capsid Assembly Models from Bulk in Vitro Data. IEEE/ACM Trans Comput Biol Bioinform 14:844-855
Wangsa, Darawalee; Chowdhury, Salim Akhter; Ryott, Michael et al. (2016) Phylogenetic analysis of multiple FISH markers in oral tongue squamous cell carcinoma suggests that a diverse distribution of copy number changes is associated with poor prognosis. Int J Cancer 138:98-109
Catanzaro, Daniele; Shackney, Stanley E; Schaffer, Alejandro A et al. (2016) Classifying the Progression of Ductal Carcinoma from Single-Cell Sampled Data via Integer Linear Programming: A Case Study. IEEE/ACM Trans Comput Biol Bioinform 13:643-55
Roman, Theodore; Xie, Lu; Schwartz, Russell (2016) Medoidshift clustering applied to genomic bulk tumor data. BMC Genomics 17 Suppl 1:6
Subramanian, Ayshwarya; Schwartz, Russell (2015) Reference-free inference of tumor phylogenies from single-cell sequencing data. BMC Genomics 16 Suppl 11:S7
Ashktorab, Hassan; Daremipouran, Mohammad; Devaney, Joe et al. (2015) Identification of novel mutations by exome sequencing in African American colorectal cancer patients. Cancer 121:34-42
Roman, Theodore; Nayyeri, Amir; Fasy, Brittany Terese et al. (2015) A simplicial complex-based approach to unmixing tumor progression data. BMC Bioinformatics 16:254
Kang, John; Puskar, Kathleen M; Ehrlicher, Allen J et al. (2015) Structurally governed cell mechanotransduction through multiscale modeling. Sci Rep 5:8622
Chowdhury, Salim Akhter; Gertz, E Michael; Wangsa, Darawalee et al. (2015) Inferring models of multiscale copy number evolution for single-tumor phylogenetics. Bioinformatics 31:i258-67
Chowdhury, Salim Akhter; Shackney, Stanley E; Heselmeyer-Haddad, Kerstin et al. (2014) Algorithms to model single gene, single chromosome, and whole genome copy number changes jointly in tumor phylogenetics. PLoS Comput Biol 10:e1003740

Showing the most recent 10 out of 36 publications