The Cancer Genome Atlas Data Analysis Center

Chin, Lynda; Getz, Gad

Abstract

The overarching goal of TCGA is to change the practice of cancer medicine and improve patient survival through cancer genomics. A key deliverable is to enable access to and use of complex multi-dimensional genomic data for downstream studies. We propose to operate a GDAC-A center with the leadership, expertise and infrastructure required to develop an analysis pipeline that will generate predefined integrative analyses and interpretations tailored for hypothesis testing by basic, translational and clinical investigators. Our team consists of experts in cancer biology, genomics and bioinformatics with a track record of leadership in TCGA. The analytical tools and pipeline structure are based on our extensive TCGA experience and are designed to optimally achieve its goals. This pipeline will be built using the GenePattern bioinformatic workflow environment, a flexible and modular architecture that is caBIG and caGRID compliant. It is maintained in the well-established, robust and secure IT infrastructure at the Broad Institute and can be operated 24/7 as a Production Pipeline. Leveraging this well-established resource, we will pursue the following specific aims.
Aim 1. We will define caBIG-compliant data formats for all input and output files. To further enhance standardization, we propose two additions to the standard data structure defined in the Pilot Project (Levels 1-4). Level 0 will define the specific versions of all reference databases used in the analyses, and Level 5 will capture disease-level findings that incorporate prior knowledge.
Aim 2. We will design analysis modules to consolidate data from all components of TCGA and to perform integrative analyses. Results will be submitted to the DCC in caBIG-compliant output files accompanied by human-readable reports containing text summaries, tables and figures in a format understandable to scientists of diverse disciplines, similar to the Results section of a publication. In addition, we are committed to continuous technical and analytical improvement of the pipeline, particularly in supporting the transition to next-generation sequencing platforms.
Aim 3. We will implement this high-throughput analysis pipeline in an industrial-level production mode with rigorous quality control, leveraging the Broad's infrastructure support and extensive experience in running and maintaining high-throughput computational pipelines.

Public Health Relevance

This GDAC-A center will deliver, in a high-throughput and reliable manner, integrative analyses of TCGA data to bridge the gap between TCGA data generation and the use of those data in biomedical research and eventual translation into the clinic. This is a key deliverable of TCGA; this effort is therefore highly relevant.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 3U24CA143845-05S1
Application #: 8925187
Study Section: Special Emphasis Panel (ZCA1-SRLB-U (O1))
Program Officer: Yang, Liming

Project Start: 2009-09-29
Project End: 2015-07-31
Budget Start: 2013-08-01
Budget End: 2015-07-31
Support Year: 5
Fiscal Year: 2014
Total Cost: $1,041,941
Indirect Cost: $420,472

Institution

Name: Broad Institute, Inc.
Department
Type
DUNS #: 623544785

City: Cambridge
State: MA
Country: United States
Zip Code: 02142

Related projects


NIH 2016 U24 CA	The Cancer Genome Atlas Data Analysis Center Getz, Gad; Chin, Lynda / Broad Institute, Inc.
NIH 2014 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$1,041,941
NIH 2013 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$3,167,900
NIH 2012 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$2,493,796
NIH 2011 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$2,535,444
NIH 2010 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$2,582,202
NIH 2010 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$693,768
NIH 2009 U24 CA	The Cancer Genome Atlas Data Analysis Center Chin, Lynda; Getz, Gad / Broad Institute, Inc.	$2,664,889

Publications

Sanchez-Vega, Francisco; Mina, Marco; Armenia, Joshua et al. (2018) Oncogenic Signaling Pathways in The Cancer Genome Atlas. Cell 173:321-337.e10

Way, Gregory P; Sanchez-Vega, Francisco; La, Konnor et al. (2018) Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas. Cell Rep 23:172-180.e3

Ricketts, Christopher J; De Cubas, Aguirre A; Fan, Huihui et al. (2018) The Cancer Genome Atlas Comprehensive Molecular Characterization of Renal Cell Carcinoma. Cell Rep 23:313-326.e5

Knijnenburg, Theo A; Wang, Linghua; Zimmermann, Michael T et al. (2018) Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep 23:239-254.e6

Peng, Xinxin; Chen, Zhongyuan; Farshidfar, Farshad et al. (2018) Molecular Characterization and Clinical Relevance of Metabolic Expression Subtypes in Human Cancers. Cell Rep 23:255-269.e4

Huang, Kuan-Lin; Mashl, R Jay; Wu, Yige et al. (2018) Pathogenic Germline Variants in 10,389 Adult Cancers. Cell 173:355-370.e14

Ding, Li; Bailey, Matthew H; Porta-Pardo, Eduard et al. (2018) Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics. Cell 173:305-320.e10

Seiler, Michael; Peng, Shouyong; Agrawal, Anant A et al. (2018) Somatic Mutational Landscape of Splicing Factor Genes and Their Functional Consequences across 33 Cancer Types. Cell Rep 23:282-296.e4

Jayasinghe, Reyka G; Cao, Song; Gao, Qingsong et al. (2018) Systematic Analysis of Splice-Site-Creating Mutations in Cancer. Cell Rep 23:270-281.e3

Saltz, Joel; Gupta, Rajarsi; Hou, Le et al. (2018) Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images. Cell Rep 23:181-193.e7

Showing the most recent 10 out of 87 publications
