This project aims to develop two novel informatics tools that enable functional integration of the autologous whole exome DNA sequencing (WES) and whole transcriptome RNA sequencing (WTS) data being systematically generated in cancer research and diagnostic labs. 1) Texomer will deconvolute the tumor genomic and transcriptomic profiles simultaneously from autologous bulk WES and WTS data and identify functional variants through integrative genome-transcriptome analysis. It will estimate tumor purity and intratumor heterogeneity in both DNA and RNA data, quantify tumor allele-specific copy number (ASCN) profiles and tumor allele-specific expression levels (ASEL), and integrate the ASCN and ASEL profiles to identify functional genomic variants. 2) TransBreak will extend our well-established k-mer-based assembly approach (novoBreak, Nature Methods 2016) to detect novel transcriptomic junctions and variants in tumor WTS data and to predict neoantigens from the assembled novel RNA isoforms. The output of both tools will be thoroughly evaluated by computational and experimental means, using data from consortia such as TCGA and working with bench and clinical collaborators under established protocols and resources. The proposed tools will be developed following best software engineering practices and released as open source via publicly available websites.
This project will develop two novel bioinformatics tools that compute tumor purity, intratumor heterogeneity, allele-specific copy numbers, and allele-specific expression levels in the tumor compartment of bulk whole exome DNA and whole transcriptome RNA sequencing data from patient-matched tissues. The goal is to identify functional mutations and neoantigens for cancer molecular diagnosis, targeted therapy, and immunotherapy.
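
To illustrate how tumor purity and allele-specific copy number together shape the signal observed in bulk sequencing, the following minimal Python sketch computes the expected variant allele fraction of a somatic variant under a simple two-compartment (tumor plus normal) mixture. This is a generic textbook-style approximation and not the model implemented in Texomer; the function name `expected_vaf` and its parameters are hypothetical, and the sketch assumes the variant is absent from normal cells and ignores subclonality.

```python
def expected_vaf(purity, tumor_alt_copies, tumor_total_copies, normal_total_copies=2):
    """Expected variant allele fraction of a somatic variant in a bulk sample.

    Assumptions (illustrative only): the sample is a mixture of tumor cells
    (fraction `purity`) and normal cells, the variant is clonal in the tumor,
    and normal cells carry no copies of the variant allele.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if not 0 <= tumor_alt_copies <= tumor_total_copies:
        raise ValueError("tumor_alt_copies must be between 0 and tumor_total_copies")

    # Variant-allele copies contributed by the tumor compartment.
    alt = purity * tumor_alt_copies
    # Total copies at the locus, averaged over tumor and normal cells.
    total = purity * tumor_total_copies + (1.0 - purity) * normal_total_copies
    if total == 0:
        raise ValueError("locus has zero total copies in the mixture")
    return alt / total


if __name__ == "__main__":
    # Heterozygous variant in a copy-neutral region at 60% purity: about 0.30.
    print(round(expected_vaf(0.6, 1, 2), 3))
    # Variant on 2 of 3 tumor copies at 50% purity: about 0.40.
    print(round(expected_vaf(0.5, 2, 3), 3))
```

The same heterozygous variant therefore appears at very different allele fractions depending on purity and local copy number, which is why these quantities need to be estimated before DNA and RNA allele-level signals can be compared.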