The proposed Genome Data Analysis Center B (GDAC B) will work cooperatively with other GDACs funded by The Cancer Genome Atlas (TCGA) project to (i) develop an innovative, integrative pipeline for systems-level analysis of TCGA's molecular profiling data on many different types of human tumors and (ii) apply that pipeline and its component modules to TCGA data to address important biological and clinical questions. An overarching goal is to 'personalize' the management of patients' cancers on the basis of new tumor biomarkers and biosignatures. For the first time, it is easier to generate millions of data points on tumors than to analyze or interpret those data; hence, the bioinformatic challenge is formidable. The pipeline will be constructed using the Agile software development paradigm and semantic web query architecture. It will be based on novel algorithms and modules developed by participants in the GDAC. Included will be modules for data integration, data visualization, pathway analysis, and systems biological interpretation, all designed to be user-friendly for the bench researcher and clinician. Those modules will be interfaced with additional ones developed by other GDACs. All development will adhere to standards of TCGA and the Cancer Biomedical Informatics Grid (caBIG) and will provide controlled access to ensure confidentiality of personally identifiable data. The proposed GDAC team brings to this project expertise in bioinformatics, biostatistics, software engineering, high-throughput molecular profiling technologies, systems-oriented biology, biomarker studies, pathology, and clinical research. The three co-PIs (for bioinformatics, systems biology, and clinical research) have each participated actively in TCGA since its inception, as have other members of the team, including the lead software engineer. A major strength is the University of Texas M. D. Anderson Cancer Center (MDACC) as an institution.
MDACC has been, and presumably will continue to be, the largest source of tumor specimens for TCGA. As one of the country's foremost cancer centers, with by far the largest cancer clinical research program, MDACC has unparalleled expertise for follow-up on medically important leads that result from the development and application of the pipeline to TCGA data.

Public Health Relevance

The Cancer Genome Atlas project will generate multi-faceted molecular profiles on 25 different human cancer types. The result will be a treasure trove of information that can be used to personalize cancer diagnosis and treatment. Analysis of the data is a bottleneck, which the proposed Genome Data Analysis Center will alleviate by building an innovative, advanced bioinformatic analysis pipeline.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Resource-Related Research Projects--Cooperative Agreements (U24)
Study Section
Special Emphasis Panel (ZCA1-SRLB-U (O1))
Program Officer
Yang, Liming
University of Texas MD Anderson Cancer Center
Biostatistics & Other Math Sci
Other Domestic Higher Education
United States
Yang, Ji-Yeon; Werner, Henrica M J; Li, Jie et al. (2016) Integrative Protein-Based Prognostic Model for Early-Stage Endometrioid Endometrial Cancer. Clin Cancer Res 22:513-23
Camargo, M Constanza; Bowlby, Reanne; Chu, Andy et al. (2016) Validation and calibration of next-generation sequencing to identify Epstein-Barr virus-positive gastric cancer in The Cancer Genome Atlas. Gastric Cancer 19:676-81
Campbell, Joshua D; Alexandrov, Anton; Kim, Jaegil et al. (2016) Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas. Nat Genet 48:607-16
Ceccarelli, Michele; Barthel, Floris P; Malta, Tathiane M et al. (2016) Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell 164:550-63
Chang, Hae Ryung; Nam, Seungyoon; Kook, Myeong-Cherl et al. (2016) HNF4α is a therapeutic target that links AMPK to WNT signalling in early-stage gastric cancer. Gut 65:19-32
Ryan, Michael; Wong, Wing Chung; Brown, Robert et al. (2016) TCGASpliceSeq a compendium of alternative mRNA splicing in cancer. Nucleic Acids Res 44:D1018-22
Zhang, Liangcai; Yuan, Ying; Lu, Karen H et al. (2016) Identification of recurrent focal copy number variations and their putative targeted driver genes in ovarian cancer. BMC Bioinformatics 17:222
Fan, Yu; Xi, Liu; Hughes, Daniel S T et al. (2016) MuSE: accounting for tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling from sequencing data. Genome Biol 17:178
Lu, Yiling; Ling, Shiyun; Hegde, Apurva M et al. (2016) Using reverse-phase protein arrays as pharmacodynamic assays for functional proteomics, biomarker discovery, and drug development in cancer. Semin Oncol 43:476-83
Cancer Genome Atlas Research Network; Linehan, W Marston; Spellman, Paul T et al. (2016) Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma. N Engl J Med 374:135-45

Showing the most recent 10 out of 108 publications