Cancer is a disease that emerges from a single cell in somatic tissue and is driven by a complex interplay of somatic mutations, copy number aberrations and chromosomal rearrangements. As a tumor progresses, diverse genomic aberrations give rise to genetically heterogeneous subpopulations (clones) of cells that interact with each other in a Darwinian framework of mutation, fitness and selection. Intra-tumor heterogeneity (ITH) complicates the diagnosis and treatment of cancer patients and causes relapse and drug resistance. The emergence of next-generation sequencing (NGS) technologies has enabled a thorough analysis of tumor heterogeneity through the generation of large-scale quantitative genomic datasets. Despite these advances, a comprehensive understanding of ITH has proved elusive thus far.

The project entails devising new mathematical formulations and developing new algorithmic techniques for inferring the evolutionary histories of tumor cells in order to understand tumor evolution and ITH. In particular, all the models and methods target single-cell sequencing (SCS) data and perform variant calling, genotyping and evolutionary history inference simultaneously. This joint inference accounts in a principled manner for errors in SCS data. Models based on both finite- and infinite-sites assumptions will be derived. The research is highly interdisciplinary, using combinatorial optimization and graph theory, as well as statistical estimation, to provide practical solutions to problems that arise in cancer genomics. While a major impact of the project will be on cancer biology, the project will also stimulate research at the intersection of mathematics, computer science and statistics. The project will provide training opportunities for students in a truly interdisciplinary setting. The research results will be integrated into courses and disseminated through publications. All models and algorithms will be implemented in freely available open-source software and disseminated via GitHub and Bitbucket.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Standard Grant (Standard)
Program Officer: Sylvia J. Spengler
Rice University
United States