In preparation for my retirement and appointment as Scientist Emeritus, which occurred on May 30, 2015, I have spent the past year working largely in collaboration with members of the Genomics and Bioinformatics Group (GBG) in our Laboratory. I used GBG's recently developed software to study gene expression correlations in the human cancer cell line data of the CCLE and the human cancer tissue data of the TCGA project (see Goals and Objectives). My focus has been on functional relationships among individual genes, in contrast to most of the ongoing studies of these databases in the GBG, which are largely statistical in nature. The TCGA database contains expression data for human tumor tissues, while the CCLE database contains expression data for cell lines derived from such tissues. In comparing those two databases, I found major differences in the expression patterns of particular cancer-related genes. This was particularly evident, for example, in genes related to the epithelial-mesenchymal transition (EMT). After further investigation and discussion in our GBG conferences, we concluded that, although stromal components in tissue samples may in some cases contribute to the difference, a major factor, at least in my view, is likely to be selection of cell types growing out in culture. Supporting the latter possibility is my observation that the TCGA data are (surprisingly) more internally and functionally consistent than the cell line data. EMT, which is thought to promote tumor invasion and metastasis, involves conversion of epithelial-like tumor cells to mesenchymal-like states with gene expression characteristics that may in some ways be similar to those in sarcomas. I therefore compared TCGA tissue data for colon adenocarcinomas with those for sarcomas.
A notable finding was that KDM4D, a histone lysine demethylase gene implicated in the repair of DNA double-strand breaks, was expressed at distinctly higher levels in sarcomas than in colon adenocarcinomas. Inhibition of this gene, or of DNA repair genes having similar expression differences, might therefore increase the sensitivity of carcinoma cells that have undergone EMT to DNA-damaging agents. (The SWI/SNF gene SMARCA1, which may also play a role in DNA repair, showed a similar expression difference.) A set of mutually highly correlated genes, for example in TCGA tissue samples from NSCLC relative to normal lung (r 0.90), consisted predominantly of genes related to mitotic spindle and centrosome function; these genes were distributed over several different chromosomes. This indicates that highly correlated gene expression in TCGA tissue samples is at least sometimes associated with shared functions. In other cases, high mutual expression correlation was observed among genes co-localized in the same chromosome region, which may be due in part to copy number peculiarities in those regions, or may reflect transcriptional accessibility in those regions. Perhaps favoring the latter possibility in some cases is that the co-localization was sometimes markedly tissue-type dependent. For example, among the 100 genes most highly correlated with the expression of TDP1 in colon adenocarcinoma tissue samples, 80 were located in regions of chromosome 14q, whereas in lung adenocarcinomas only 8 genes were located there. This localization at 14q was also apparent in breast cancer but not in normal breast tissue samples.
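The co-localization count described above (how many of the genes most correlated with a target gene fall in one chromosome region) can be illustrated with a minimal sketch. This is not the GBG software, whose interface is not described here. The function names `top_correlated` and `count_in_arm`, the use of Pearson correlation, and all gene names, expression values, and chromosome-arm assignments in the example are hypothetical placeholders.

```python
import numpy as np

def top_correlated(expr, genes, target, n=100):
    """Return up to n (gene, r) pairs most positively correlated with `target`.

    expr  : 2-D array, rows = genes, columns = tissue samples
    genes : list of gene names, one per row of expr
    Pearson r is an assumption; the source does not name the statistic.
    """
    expr = np.asarray(expr, dtype=float)
    idx = genes.index(target)
    # Center each gene, then scale to unit length so dot products give r.
    centered = expr - expr.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    norms[norms == 0] = np.nan          # constant genes have undefined r
    unit = centered / norms[:, None]
    r = unit @ unit[idx]
    r[idx] = np.nan                     # exclude the target itself
    order = np.argsort(-np.where(np.isnan(r), -np.inf, r))
    return [(genes[i], float(r[i])) for i in order[:n] if not np.isnan(r[i])]

def count_in_arm(top, gene_to_arm, arm):
    """Count how many of the top-correlated genes map to one chromosome arm."""
    return sum(1 for gene, _ in top if gene_to_arm.get(gene) == arm)

# Hypothetical toy data: 5 genes measured in 20 samples.
base = np.linspace(0.0, 1.0, 20)
genes = ["TARGET", "GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expr = np.vstack([
    base,                          # TARGET
    2.0 * base + 1.0,              # GENE_A: perfectly correlated
    base ** 2,                     # GENE_B: strongly correlated
    -base,                         # GENE_C: anti-correlated
    np.tile([1.0, -1.0], 10),      # GENE_D: essentially uncorrelated
])
# Made-up chromosome-arm annotation, for illustration only.
arm_of = {"GENE_A": "14q", "GENE_B": "14q", "GENE_C": "2p", "GENE_D": "7q"}

top = top_correlated(expr, genes, "TARGET", n=2)
print(top, count_in_arm(top, arm_of, "14q"))
```

Running the same count separately for each tissue type would give the kind of tissue-dependent comparison reported above, provided a real gene-to-location annotation is supplied.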
In the CCLE cell lines derived from particular tumor tissue types, several genes functioning in cell proliferation control, DNA repair, or control of epithelial versus mesenchymal character exhibited a dichotomous expression pattern, such that most cell lines derived from the same type of tumor had either distinctly high or distinctly low expression. We do not know whether these represent different tumor subtypes or whether many of those tumors contain mixtures of 2 or more cell types, one of which may grow to dominate the culture in a particular cell line. Many of the genes showing expression dichotomies are known to affect anticancer drug sensitivities, suggesting that they may be indicators of drug response or possibly new drug targets. The existence of a dichotomy in gene expression in cell lines derived from the same tumor type suggests that there is a discrete mechanism that determines the on-or-off transcription of that gene. A particularly interesting 2-way dichotomy was observed for MGMT and SLFN11 in lung cancer cell lines (both SCLC and NSCLC), in which there were 4 distinct subtypes representing the 4 possible combinations of high versus low expression. Of particular interest would be cell lines, or especially tumors, with low MGMT and high SLFN11 expression, which would suggest synergistic sensitivity to a combination of temozolomide and a topoisomerase inhibitor. Several genes showed distinctly elevated or reduced expression in cell lines derived from small cell lung cancers (SCLC) relative to other types of lung cancer or, in some cases, to other tumor types in general. Some genes showed a dichotomy, such that some SCLC lines showed distinctly high (or low) expression whereas others showed normal expression.
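The 2-way dichotomy in lung cancer cell lines described above amounts to assigning each cell line to one of 4 high/low groups. The sketch below shows one way this could be done, assuming a simple cutoff placed at the widest gap in each gene's sorted expression values. That cutoff rule, the function names, and the cell line labels and expression values are all hypothetical; the report does not state how high and low expression were distinguished.

```python
def largest_gap_threshold(values):
    """Midpoint of the widest gap between consecutive sorted values.

    A crude stand-in for a proper bimodality analysis; it is sensitive
    to outliers and assumes the two expression states are well separated.
    """
    s = sorted(values)
    if len(s) < 2:
        raise ValueError("need at least two values")
    # Index of the left edge of the widest gap.
    i = max(range(len(s) - 1), key=lambda k: s[k + 1] - s[k])
    return (s[i] + s[i + 1]) / 2

def two_way_groups(expr_a, expr_b):
    """Label each cell line ('high'|'low', 'high'|'low') for two genes.

    expr_a, expr_b : dicts mapping cell line name -> expression value
    """
    cut_a = largest_gap_threshold(expr_a.values())
    cut_b = largest_gap_threshold(expr_b.values())
    return {
        line: (
            "high" if expr_a[line] > cut_a else "low",
            "high" if expr_b[line] > cut_b else "low",
        )
        for line in expr_a
    }

# Hypothetical expression values for four made-up cell lines.
mgmt = {"line1": 0.5, "line2": 0.7, "line3": 8.1, "line4": 7.9}
slfn11 = {"line1": 9.0, "line2": 0.3, "line3": 8.5, "line4": 0.6}

groups = two_way_groups(mgmt, slfn11)
# Lines in the (low first gene, high second gene) group are the
# combination of particular interest in the text above.
candidates = [name for name, g in groups.items() if g == ("low", "high")]
print(groups, candidates)
```

With real data, a mixture model or a fixed biologically motivated cutoff would likely be more robust than the widest-gap rule used here.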
Most SCLC cell lines were distinguished from almost all other cell lines in the CCLE database by high expression of ASCL1, as well as of MYCL; high expression of these 2 genes was associated with low expression of NOTCH2, which in turn was associated with high expression of DLL1, consistent with the reported inhibition of NOTCH by DLL1 in neighboring cells. SCLC cell lines exhibited distinctly high expression of INSM1, consistent with the neuroendocrine nature of these cell types. High expression of ASCL1 and INSM1 may be a marker for neuroendocrine tumor cell types, including SCLC; the neuroendocrine marker genes NEUROD1 and CHGA may also be helpful, because their expression in SCLC was usually high even in those few lines that did not express high ASCL1. High expression of MYCL was almost always associated with low expression of MYC, and this relationship was prominent in many SCLC cell lines; cell lines presumably may use either form of Myc to support proliferation and do not need to express both forms. SCLC cell lines exhibited distinctly reduced expression of genes related to the degradation of extracellular matrix during cell migration (genes of the network described in Kohn et al., 2012, PLOS ONE); this finding is consistent with the proposal that SCLC cells, unlike most mesenchymal tumor cells, migrate in amoeboid fashion, which does not require extracellular matrix degradation. SCLC cell lines exhibited a dichotomy between high and low expression of MGMT, whose product removes alkylations from DNA guanine-O6 positions. Cells deficient in this gene product tend to be sensitive to alkylating agents such as temozolomide; thus some identifiable SCLC tumors may be sensitive to drugs of this type. Many SCLC cell lines showed an unusual expression pattern in the ephrin systems; for example, high expression of EFNB3 coupled with low expression of its receptor EPHB4.
Another unusual expression pattern characteristic of SCLC cell lines was high expression of GADD45G coupled with low expression of GADD45B. (We unfortunately did not have ready access to adequate data for SCLC tissue samples.)

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Investigator-Initiated Intramural Research Projects (ZIA)
Project #
1ZIABC011662-01
Application #
9154024
Support Year
1
Fiscal Year
2015
Name
Basic Sciences