Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc

Haussler, David

Abstract

Cancer genomics resources are growing at an unprecedented pace. However, a comprehensive analysis of the cancer genome still remains a daunting challenge. This is in part due to the difficulties in visualizing, integrating, and analyzing cancer genomics data with current technologies. We propose to develop a cloud-based platform to empower researchers with the ability to host, visualize and analyze their own data. The platform is composed of a set of Cancer Analytics Virtual Machines (CAVMs). The main component of each CAVM is a data server which functions to store and serve user data to applications, such as the UCSC Cancer Genomics Browser, to provide data visualization. The second component is a modified Galaxy workflow system to provide data analysis capability. UCSC's suite of analysis tools for next-generation sequencing data analysis and pathway inference will be prepackaged with the system. The two components will be highly integrated to allow tightly coupled cycles of data visualization and analysis. The data server component will be modular such that it can provide data independently to applications besides the Cancer Browser and Galaxy. We will deliver virtual machine images that can be easily initiated in a commercial cloud such as Amazon, or can be installed within a user's own institution. The CAVM also functions as a way for users to integrate with external large-scale databases. We will deliver a UCSC CAVM that other CAVM instances can connect to, to provide authorized data access from the UCSC cancer genomics data repository. The system allows the dynamic formation of new datasets composed of data slices from multiple sources. This ability to combine data into larger samples will provide the statistical power to allow discoveries that would otherwise not be possible.
We aim to eliminate, or significantly reduce, the overhead of system configuration and software installation. Our tools will provide users the capability to access a cloud-based cluster computing environment, which will make sophisticated, computationally intensive analyses accessible to researchers who might not have access to compute servers. The software platform we develop can be used by individual bench biologists, and also by large projects to serve data to individual users or to other projects. This design has the potential to form an expansive federated database accessible through the same software interface.
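
As an illustration only, the dynamic formation of a dataset from data slices held by multiple sources might be sketched as follows. This is a minimal sketch, not the project's implementation: the DataSlice class, the combine_slices function, the source labels, and all sample IDs and values are hypothetical, and the choice to keep only the features shared by every slice is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DataSlice:
    """Hypothetical slice of data served by one source (e.g. one CAVM)."""
    source: str                           # illustrative label for the source
    values: Dict[str, Dict[str, float]]   # sample ID -> {feature -> value}


def combine_slices(
    slices: List[DataSlice],
) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Merge slices into one dataset keyed by (source, sample ID).

    Assumption for this sketch: only features present in every sample
    of every slice are kept, so the combined samples are comparable.
    """
    non_empty = [s for s in slices if s.values]
    if not non_empty:
        return {}

    # Intersect the feature sets across all samples in all slices.
    shared = None
    for s in non_empty:
        for features in s.values.values():
            keys = set(features)
            shared = keys if shared is None else shared & keys

    # Build the combined dataset; the source label keeps sample IDs unique.
    combined = {}
    for s in non_empty:
        for sample_id, features in s.values.items():
            combined[(s.source, sample_id)] = {
                name: features[name] for name in sorted(shared)
            }
    return combined


# Made-up values, purely to show the shape of the data.
local = DataSlice("local-cavm", {"S1": {"TP53": 1.0, "EGFR": 2.0}})
remote = DataSlice(
    "ucsc-cavm",
    {"T1": {"TP53": 0.5, "MYC": 3.0}, "T2": {"TP53": 0.7, "MYC": 1.0}},
)
combined = combine_slices([local, remote])
```

Here one local sample and two remote samples are pooled into a three-sample dataset restricted to the single shared feature, which is the sense in which combining slices enlarges the sample available for analysis.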

Public Health Relevance

Currently, clinicians and bench biologists typically depend on external collaborators for data analysis. The proposed system will provide these scientists with data analysis and visualization methods that are both powerful and easy to use. This will accelerate research in the understanding and treatment of cancer, the second-leading cause of death in the U.S.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Resource-Related Research Projects--Cooperative Agreements (U24)
Project #: 5U24CA180951-02
Application #: 8735909
Study Section: Special Emphasis Panel (ZCA1)
Program Officer: Li, Jerry

Project Start: 2013-09-17
Project End: 2018-08-31
Budget Start: 2014-09-01
Budget End: 2015-08-31
Support Year: 2
Fiscal Year: 2014
Total Cost
Indirect Cost

Institution

Name: University of California Santa Cruz
Department: Engineering (All Types)
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #

City: Santa Cruz
State: CA
Country: United States
Zip Code: 95064

Related projects


NIH 2017 U24 CA	Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc Haussler, David H. / University of California Santa Cruz
NIH 2016 U24 CA	Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc Haussler, David H. / University of California Santa Cruz
NIH 2015 U24 CA	Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc Haussler, David H. / University of California Santa Cruz
NIH 2014 U24 CA	Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc Haussler, David H. / University of California Santa Cruz
NIH 2013 U24 CA	Cloud Based Resource for Data Hosting, Visualization and Analysis Using UCSC Canc Haussler, David H. / University of California Santa Cruz	$621,154

Publications

Vivian, John; Rao, Arjun Arkal; Nothaft, Frank Austin et al. (2017) Toil enables reproducible, open source, big biomedical data analyses. Nat Biotechnol 35:314-316

Fishbein, Lauren; Leshchiner, Ignaty; Walter, Vonn et al. (2017) Comprehensive Molecular Characterization of Pheochromocytoma and Paraganglioma. Cancer Cell 31:181-193

Cherniack, Andrew D; Shen, Hui; Walter, Vonn et al. (2017) Integrated Molecular Characterization of Uterine Carcinosarcoma. Cancer Cell 31:411-423

Cancer Genome Atlas Research Network; Albert Einstein College of Medicine; Analytical Biological Services et al. (2017) Integrated genomic and molecular characterization of cervical cancer. Nature 543:378-384

Cancer Genome Atlas Research Network; Linehan, W Marston; Spellman, Paul T et al. (2016) Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma. N Engl J Med 374:135-45

Speir, Matthew L; Zweig, Ann S; Rosenbloom, Kate R et al. (2016) The UCSC Genome Browser database: 2016 update. Nucleic Acids Res 44:D717-25

Ceccarelli, Michele; Barthel, Floris P; Malta, Tathiane M et al. (2016) Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma. Cell 164:550-63

Blau, C Anthony; Ramirez, Arturo B; Blau, Sibel et al. (2016) A Distributed Network for Intensive Longitudinal Monitoring in Metastatic Triple-Negative Breast Cancer. J Natl Compr Canc Netw 14:8-17

Zheng, Siyuan; Cherniack, Andrew D; Dewal, Ninad et al. (2016) Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma. Cancer Cell 29:723-736

Cancer Genome Atlas Network (2015) Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 517:576-82

Showing the most recent 10 out of 22 publications
