Tumor heterogeneity refers to the observation that tumor cells can display distinct phenotypic and morphological characteristics, such as gene expression, metabolism, cellular morphology, proliferation, motility, and metastatic potential. This phenomenon occurs both between tumors (inter-tumor heterogeneity) and within tumors (intra-tumor heterogeneity). Understanding tumor heterogeneity is a prerequisite for personalized tumor diagnosis and treatment. The objective of this project is to develop statistical methods to study intra-tumor heterogeneity using next-generation sequencing data. These methods will address important statistical, computational, and biological challenges that arise from recent cancer genomics studies. The applications will further our understanding of the mechanisms underlying tumor heterogeneity as well as its clinical consequences. The research will provide opportunities to attract and nurture diverse future scientists to work at the frontiers of computational cancer genomics. The project will also provide research training opportunities for undergraduate and graduate students. User-friendly open-source software implementing the research methods will be developed, distributed, and supported to benefit the genomics and statistics communities.
This project will develop clonality analysis methods that have a firm statistical footing and are tailored to the characteristics of data from different sequencing technologies. A new stochastic process will be developed to model clonal expansion. Using likelihood-based approaches, the PI will construct clonal history empowered by several novel strategies: (1) incorporating phase information, which bridges cancer genomics studies with the rich germline variant resources profiled by consortia such as the 1000 Genomes Project; (2) integrating single-cell RNA sequencing and bulk-tissue DNA sequencing data; and (3) adjusting for characteristics of local sequences. These analyses can also potentially open up new opportunities for joint analysis of tumor heterogeneity and a rich list of epigenomic features such as DNA accessibility and methylation. Successful achievement of all aims will dramatically increase the power of subclone identification in cancer genomics studies.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.