To achieve the goal of broadly impacting cancer diagnosis and prognosis, we propose feature allocation models for the inference of tumor heterogeneity (TH) using next-generation sequencing (NGS) data. Building upon the Indian buffet process (IBP) in nonparametric Bayesian statistics, we propose posterior inference on unobserved subclones in a tumor sample at the nucleotide level. The subclones are marked by distinctive DNA sequences and copy numbers, reflecting the variations that occur during clonal expansion and tumorigenesis. We will also develop efficient computational approaches for analyzing the extensive data generated by NGS experiments, paving the way for real-life applications of the proposed methods.
In Aims 1 and 2, we will focus on developing statistical models that account for noise in NGS data and on setting up scalable computation. We will generalize the classical IBP model to accommodate both categorical and dependent random matrices, giving rise to the cIBP and dIBP models.
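For readers unfamiliar with the classical IBP that Aims 1 and 2 generalize, the following is a minimal sketch of a draw from the standard IBP prior over binary feature allocation matrices. It illustrates only the textbook construction, not the proposed cIBP or dIBP models. The function names `sample_poisson` and `sample_ibp`, the mass parameter `alpha`, and the reading of rows as nucleotide loci and columns as latent subclones are assumptions made for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_rows, alpha, seed=0):
    """Draw a binary matrix from the standard IBP prior.

    Illustrative reading (an assumption): rows are loci, columns are
    latent subclones, and Z[i][k] = 1 means locus i is altered in subclone k.
    """
    rng = random.Random(seed)
    rows = []            # ragged rows; padded to a common width at the end
    column_counts = []   # number of earlier rows that use each column
    for i in range(1, num_rows + 1):
        # Reuse existing column k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in column_counts]
        # Open Poisson(alpha / i) brand-new columns for this row.
        num_new = sample_poisson(alpha / i, rng)
        row.extend([1] * num_new)
        column_counts = [m + z for m, z in zip(column_counts, row)] + [1] * num_new
        rows.append(row)
    width = len(column_counts)
    return [r + [0] * (width - len(r)) for r in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_rows=6, alpha=2.0, seed=1):
        print(row)
```

The number of columns is not fixed in advance: it grows with the number of rows at a rate controlled by `alpha`, which is what makes the prior suitable when the number of subclones is unknown.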
In Aim 3, we propose a TH-based clinical trial for personalized cancer treatment. A unique feature of the trial is its comparison of adaptive treatment strategies based on TH with a standard, fixed treatment strategy that ignores TH. We intend to develop innovative and efficient Bayesian computational approaches, apply the proposed methods to in-house and publicly available genomics data, and disseminate all of the developed tools through our online portal at www.compgenome.org (Aim 4).
The proposed research will promote advancement in statistical methodology and foster the development of new classes of Bayesian nonparametric models. Further, this statistical advancement will allow important questions on tumor heterogeneity to be addressed.
Innovative statistical models for the inference of tumor heterogeneity are expected to significantly improve medical decision making, such as individualized treatment selection for cancer patients. Improved decisions will in turn accelerate the learning of the optimal treatment strategy, thereby improving the overall quality of patient care. The combination of statistical modeling and big-data implementation in a modern electronic health system will pioneer a new generation of medical practice and is expected to drastically improve the efficiency of disease prevention, diagnosis, and prognosis.