Cancer immunotherapy has achieved remarkable clinical success in treating late-stage tumors, yet response rates remain low and side effects are often severe. Designing effective immunotherapies relies on accurate identification of tumor-reactive T cells. This is an extremely difficult task because 1) most cancer antigens are unknown; 2) the majority of tumor-infiltrating T cells (TILs) do not recognize cancer cells; and 3) without known antigens, the only approach to acquire such T cells is ex vivo expansion of TILs stimulated by autologous cancer cells, which generates non-specific T cells and is infeasible for many patients. Nonetheless, this strategy is widely adopted in current clinical trials of anti-cancer treatment, despite its reduced therapeutic efficacy and unpredictable autoimmune side effects. Therefore, unbiased, antigen-independent identification of tumor-reactive T cells, if possible, will be a major clinical priority, as it will significantly increase the efficiency and safety of T cell-based immunotherapies. Here we propose to achieve this goal through the development of novel machine learning methods. Such an approach has not yet been explored because the fundamental difference between cancer and non-cancer T cells lies in their T cell receptor (TCR) sequences, and training data of cancer-specific TCRs are currently unavailable. To prepare for this task, we have developed the software TRUST to extract the T cell antigen-binding CDR3 regions from bulk tumor RNA-seq data, and the software iSMART to group these CDR3s into antigen-specific clusters. These tools allowed us to develop a new rationale for producing large training sets of tumor-reactive TCRs, even without knowing the cancer antigens. In our preliminary analysis, we observed that TCRs from the training data can be matched to tumor antigens that bind to HLA-A*02:01 and elicit immune responses in vivo.
The cancer-specific CDR3 amino acid sequences also show significantly different biochemical features from non-cancer ones, based on which we further developed the software DeepCAT to demonstrate the feasibility of de novo prediction of cancer TCRs. These exciting results highlight the importance of developing better computational methods to track tumor-reactive T cells for clinical applications. Accordingly, we propose the following Specific Aims:
In Aim 1, we will deliver a new machine learning method for accurate classification of tumor-reactive T cells using the CDR3 sequences.
In Aim 2, we will derive a set of biomarkers for cancer-specific T cells to enable fast and accurate flow sorting of these T cells from TILs.
In Aim 3, we will perform single-cell sequencing and functional validation of cancer-specific T cells in a humanized animal model to validate the predicted genes and to produce a prioritized list of promising targets for cancer diagnosis, prognosis and therapy development.
These Aims will be accomplished with support from collaborators specializing in cancer immunology at UTSW. Successful completion of this proposal will provide a new paradigm for identifying tumor-reactive T cells for precision cancer immunotherapies.

Public Health Relevance

Identification of tumor-specific T cells is critical to immunotherapy development, yet remains a challenging task. In this project we will develop a novel machine learning method to study: 1) which T cells in the tumor microenvironment are reactive to malignant cells; and 2) which signature genes can be used to track the cancer-specific T cells. Outcomes from this project are expected to improve the efficacy and precision of cancer immunotherapies.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Research Project (R01)
Project #
1R01CA245318-01A1
Application #
10051274
Study Section
Cancer Biomarkers Study Section (CBSS)
Program Officer
Dey, Sumana Mukherjee
Project Start
2020-09-01
Project End
2025-05-31
Budget Start
2020-09-01
Budget End
2021-05-31
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Texas Sw Medical Center Dallas
Department
Type
DUNS #
800771545
City
Dallas
State
TX
Country
United States
Zip Code
75390