Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor

Morgan, Martin

Abstract

This proposal develops scalable R / Bioconductor software infrastructure and data resources to integrate complex, heterogeneous, and large cancer genomic experiments. The falling cost of genomic assays facilitates collection of multiple data types (e.g., gene and transcript expression, structural variation, copy number, methylation, and microRNA data) from a set of clinical specimens. Furthermore, substantial resources are now available from large consortium activities like The Cancer Genome Atlas (TCGA). Existing analysis pipelines focus on the treatment of a specific data type, leaving a critical need for tools for integrative analysis of multiple genomic assays for locally generated or publicly available data. R / Bioconductor has historically provided standardized genomic data structures and annotations that have enjoyed widespread adoption in the cancer genomics research community. This proposal adapts R / Bioconductor to meet the increasing conceptual and computational complexity of multi-assay cancer genomic experiments. We begin by developing software containers for coordinated representation, manipulation, and transformation of heterogeneous derived data from multiple cancer genomic assays. These containers are then extended to manage very large primary data resources. To facilitate integration of local experimental results with major public cancer genomics experiment data sets and annotations, we re-package public resources and provide software and cloud-based facilities for easy and fast programmatic access from within R / Bioconductor. This greatly simplifies cancer genomic analysis tasks that otherwise require significant, error-prone individual efforts. Finally, we provide software infrastructure to enable high-throughput computation using parallel and iterative approaches.
The ability to manipulate multi-assay cancer genomic experiments and to understand individual experimental results in the context of public experiments and annotations, together with facilities for improved high-throughput computational performance in a well-established computing environment, greatly enhances opportunities for analysis and comprehension of large multi-assay cancer genomic experiments.

Public Health Relevance

Researchers collect diverse types of complex genetic information about factors that contribute to cancer. This proposal helps researchers manage and analyze this information using advanced computational and statistical approaches.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 5U24CA180996-06
Application #: 9544049
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Chen, Huann-Sheng

Project Start: 2014-09-01
Project End: 2019-08-31
Budget Start: 2018-09-01
Budget End: 2019-08-31
Support Year: 6
Fiscal Year: 2018
Total Cost
Indirect Cost

Institution

Name: Roswell Park Cancer Institute Corp
Department
Type
DUNS #: 824771034

City: Buffalo
State: NY
Country: United States
Zip Code: 14263

Related projects


NIH 2020 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R/Bioconductor Morgan, Martin T.; Waldron, Levi David / Roswell Park Cancer Institute Corp
NIH 2019 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R/Bioconductor Morgan, Martin T.; Waldron, Levi David / Roswell Park Cancer Institute Corp
NIH 2018 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor Morgan, Martin T. / Roswell Park Cancer Institute Corp
NIH 2017 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor Morgan, Martin T. / Roswell Park Cancer Institute Corp
NIH 2017 U24 CA	A Bioconductor Software Package for LISH-seq Probe Design and Data Analysis (1 of 2) Morgan, Martin T. / Roswell Park Cancer Institute Corp
NIH 2016 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor Morgan, Martin T. / Roswell Park Cancer Institute Corp
NIH 2015 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor Morgan, Martin T. / Roswell Park Cancer Institute Corp	$696,669
NIH 2014 U24 CA	Cancer Genomics: Integrative and Scalable Solutions in R / Bioconductor Morgan, Martin / Fred Hutchinson Cancer Research Center

Publications

Ma, Siyuan; Ogino, Shuji; Parsana, Princy et al. (2018) Continuity of transcriptomes among colorectal cancer subtypes based on meta-analysis. Genome Biol 19:142

Chen, Gregory M; Kannan, Lavanya; Geistlinger, Ludwig et al. (2018) Consensus on Molecular Subtypes of High-Grade Serous Ovarian Carcinoma. Clin Cancer Res 24:5037-5047

Waldron, Levi (2018) Data and Statistical Methods To Analyze the Human Microbiome. mSystems 3:

Myint, Leslie; Kleensang, Andre; Zhao, Liang et al. (2017) Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics. Anal Chem 89:3517-3523

Fortin, Jean-Philippe; Triche Jr, Timothy J; Hansen, Kasper D (2017) Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics 33:558-560

Pasolli, Edoardo; Schiffer, Lucas; Manghi, Paolo et al. (2017) Accessible, curated metagenomic data through ExperimentHub. Nat Methods 14:1023-1024

Quiroz-Zárate, Alejandro; Harshfield, Benjamin J; Hu, Rong et al. (2017) Expression Quantitative Trait loci (QTL) in tumor adjacent normal breast tissue and breast tumor tissue. PLoS One 12:e0170181

Ramos, Marcel; Schiffer, Lucas; Re, Angela et al. (2017) Software for the Integration of Multiomics Experiments in Bioconductor. Cancer Res 77:e39-e42

Kannan, Lavanya; Ramos, Marcel; Re, Angela et al. (2016) Public data and open source tools for multi-assay genomic investigation of disease. Brief Bioinform 17:603-15

Spratt, Daniel E; Chan, Tiffany; Waldron, Levi et al. (2016) Racial/Ethnic Disparities in Genomic Sequencing. JAMA Oncol 2:1070-4

Showing the most recent 10 out of 12 publications
