The Buetow laboratory has developed analytical tools for large-scale, multi-dimensional cancer genome data. We have developed sensitive mutation detection methods (SNPdetector and IndelDetector) for identifying somatic mutations in tumor tissues, and the Cancer Genome Workbench (CGWB), a visualization tool that integrates somatic mutation data with copy number alteration, gene expression, methylation and microRNA expression. CGWB has integrated data from TCGA, TARGET, the Sanger Center COSMIC initiative, the NHGRI Tumor Sequencing Project (TSP), whole genome somatic mutation data from the Vogelstein laboratory at Johns Hopkins University and the GlaxoSmithKline Cancer Cell Line Genomic Profiling Data, thereby facilitating the integration and interpretation of diverse, high-quality raw data for the entire cancer research community.
The Buetow laboratory is an active member of the TCGA consortium and has contributed analysis to the GBM flagship publication. For example, the laboratory developed the first computational pipeline for mutation detection and brought the novel NF1 mutation finding to the consortium for validation. Since this publication, an additional collection of GBM samples and genes has been sequenced. These extended data contain 130 validated somatic mutations and 28 putative somatic mutations. With the exception of ERBB2, there is no significant difference between the somatic mutations reported in the original samples and those in the additional samples among the original 8 genes (i.e. TP53, EGFR, PTEN, NF1, ERBB2, RB1, PIK3R1 and PIK3CA). No ERBB2 somatic mutations were found in the additional 89 samples. Interestingly, 7 of the 10 samples with ERBB2 mutations among the original 91 were secondary or recurrent GBM samples that had been treated with neo-adjuvant chemotherapy and radiation. The results suggest that the ERBB2 mutations may arise only in secondary GBM. The TCGA consortium is still preparing its flagship publication.
To date, the Buetow laboratory contribution includes the initial computational identification of somatic mutations, somatic allele loss, and discovery of fusion gene products. Our analysis of the Sanger sequencing data first identified that all TCGA ovarian cancer samples have TP53 somatic mutations. The 100% TP53 mutation rate, coupled with the gross somatic copy number alteration, suggested that these tumor samples are high-grade cyst carcinomas, a finding that was later confirmed when pathology data became available. We brought truncation mutations in BRCA1 and BRCA2 to the attention of the consortium. Somatic and germline truncation mutations were found in approximately 25% of the patients. By examining the integrated view of sequencing and copy number alterations in these samples, we also discovered LOH at the BRCA1 locus in all samples (100%), most of which appeared to be caused by copy-neutral LOH. 50% of the samples have LOH at the BRCA2 locus, including one sample that has a reversion from mutation to wild-type: its germline frameshift mutant allele was lost by LOH. These findings were later verified by examination of next-gen sequencing data. In addition to TP53 and BRCA1/2, we have also detected high-frequency mutations in TTN and MUC16. The sample that has the BRCA2 reversion mutation appears to have fusion proteins that arise from amplification and deletion breakpoints occurring in two fusion partners.
The Buetow laboratory is responsible for analyzing mutations for the childhood acute lymphoblastic leukemia (ALL) TARGET project. The laboratory has identified novel recurrent activating mutations in the Janus kinases JAK1 (n = 3), JAK2 (n = 16), and JAK3 (n = 1) in 20 (10.7%) of 187 BCR-ABL1-negative, high-risk pediatric ALL cases. The JAK1 and JAK2 mutations involved highly conserved residues in the kinase and pseudokinase domains and resulted in constitutive JAK-STAT activation and growth factor independence of Ba/F3-EpoR cells.
The presence of JAK mutations was significantly associated with alteration of IKZF1 (70% of all JAK-mutated cases and 87.5% of cases with JAK2 mutations; p-value = 0.001) and deletion of CDKN2A/B (70% of all JAK-mutated cases and 68.9% of JAK2-mutated cases). The JAK-mutated cases had a gene expression signature similar to that of BCR-ABL1 in pediatric ALL, and they had a poor outcome. These results suggest that inhibition of JAK signaling is a logical target for therapeutic intervention. The COG Phase 1 Consortium is now developing a trial of a JAK inhibitor, expected to start in 2010.
Systematic analysis of all 185 validated somatic mutations identified four key pathways that are highly mutated in ALL: RAS signaling (39%), JAK signaling (10%), p53/RB signaling (6%), and B-cell development (14%). The RAS signaling pathway has the highest mutation rate, and its mutations are over-represented in two gene expression clusters that correspond to patients with good clinical outcome and those with no sentinel cytogenetic lesions (p-value < 0.0001). In contrast, no mutation has been found in the PI-3K pathway.
In addition to studying primary tumors, the Buetow laboratory has analyzed the gene expression profiles of the NCI60 cell lines in a time-course drug induction experiment carried out by Drs. Anne Monk and Jim Doroshow of DCTD DTP. The NCI60 panel represents the front end of the NCI's drug development platform, in which candidate agents are tested for effects on cell growth inhibition and death. Many thousands of compounds have now been screened using this panel. Multidimensional molecular characterizations of the constitutional state of this multi-cancer-type panel have been performed, including candidate gene mutation analysis, copy number assessment, and gene expression profiles.
We have analyzed gene expression profiles after 2hr, 6hr and 24hr exposure to the following five drugs, each at a low and a high dose: doxorubicin, bortezomib, dasatinib, taxol and sunitinib. The global gene expression changes induced by drug exposure differ markedly among the drugs. Doxorubicin: low-dose patterns appear one time interval behind the high-dose patterns. Bortezomib: massive expression changes occur at early time points and there is no major difference between low and high dose. Taxol: low-dose and high-dose treatments have similar effects. Dasatinib: a triangle pattern indicates that sensitive cell lines show expression changes early on while resistant cell lines show no change throughout. Sunitinib: only high-dose treatment results in dramatic expression changes.
To identify gene expression signatures related to drug sensitivity, we are: a) using GLMs to calculate the correlation of expression with GI50; b) identifying pathways with over-representation of induced or reduced genes using the Fisher omnibus test; c) applying PathOlogist analysis before and after treatment to identify significantly altered pathways; and d) performing gene-rank analysis to evaluate similarity in drug response profiles by drug, dose, time and tissue.
[Figure: panels BTG, Cell cycle, p53; legend: induced at 2hr, induced at 6hr, induced at 24hr, reduced at 24hr]
For the comparison of cross-drug response (d), we used the Kolmogorov-Smirnov test to compare the top 200 most up-regulated and most down-regulated genes after drug treatment. Across the five drugs analyzed, profiles from the same drug treatment correlate strongly. Cross-drug analysis, however, reveals that dasatinib and sunitinib show similar profiles, which is consistent with expectations because both drugs are kinase inhibitors. Response to a drug can vary from one cell line to another and appears to be influenced by genomic composition. For example, all 11 cell lines that are p53 wild-type, p16-deleted and MDR1-negative are sensitive to doxorubicin.
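The pathway-level step (b) above can be illustrated with a minimal sketch, assuming that the "Fisher omnibus test" refers to Fisher's combined probability method, in which k independent gene-level p-values are combined as X = -2 Σ ln(p_i) and X is compared against a chi-square distribution with 2k degrees of freedom. The function names and the example p-values are hypothetical and are not taken from the laboratory's pipeline; the sketch uses only the Python standard library.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p). Every p-value must lie in (0, 1].
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, 2 * len(p_values))


# Hypothetical gene-level p-values for the genes of one pathway
# (illustrative numbers only, not data from this project).
pathway_gene_p = [0.04, 0.20, 0.01, 0.65, 0.08]
stat, combined = fisher_combined_p(pathway_gene_p)
print(f"X = {stat:.3f}, combined p = {combined:.4g}")
```

Fisher's method assumes that the combined p-values are independent. Genes within a pathway are often co-regulated, so a combined p-value computed this way is best read as a ranking score unless the dependence is accounted for, for example by permutation.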

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Scientific Computing Intramural Research (ZIH)
National Cancer Institute Division of Basic Sciences
Edmonson, Michael N; Zhang, Jinghui; Yan, Chunhua et al. (2011) Bambino: a variant detector and alignment viewer for next-generation sequencing data in the SAM/BAM format. Bioinformatics 27:865-6
Greenblum, Sharon I; Efroni, Sol; Schaefer, Carl F et al. (2011) The PathOlogist: an automated tool for pathway-centric analysis. BMC Bioinformatics 12:133
Efroni, Sol; Ben-Hamo, Rotem; Edmonson, Michael et al. (2011) Detecting cancer gene networks characterized by recurrent genomic alterations in a population. PLoS One 6:e14437
Clifford, Robert J; Zhang, Jinghui; Meerzaman, Daoud M et al. (2010) Genetic variations at loci involved in the immune response are risk factors for hepatocellular carcinoma. Hepatology 52:2034-43
Buetow, Kenneth H (2009) An infrastructure for interconnecting research institutions. Drug Discov Today 14:605-10
Buetow, Kenneth H; Niederhuber, John (2009) Infrastructure for a learning health care system: CaBIG. Health Aff (Millwood) 28:923-4; author reply 924-5
Radtke, Ina; Mullighan, Charles G; Ishii, Masami et al. (2009) Genomic analysis reveals few genetic alterations in pediatric acute myeloid leukemia. Proc Natl Acad Sci U S A 106:12944-9
Schaefer, Carl F; Anthony, Kira; Krupa, Shiva et al. (2009) PID: the Pathway Interaction Database. Nucleic Acids Res 37:D674-9
Mullighan, Charles G; Zhang, Jinghui; Harvey, Richard C et al. (2009) JAK mutations in high-risk childhood acute lymphoblastic leukemia. Proc Natl Acad Sci U S A 106:9414-8
Mullighan, Charles G; Su, Xiaoping; Zhang, Jinghui et al. (2009) Deletion of IKZF1 and prognosis in acute lymphoblastic leukemia. N Engl J Med 360:470-80
