MeV: Software for Next Generation Genomic Data Analysis

Quackenbush, John

Abstract

The most transformative aspect of the Human Genome Project was not the sequence of the genome itself, but the technologies that the project has spawned. Genomics, functional genomics, proteomics, metabolomics, systems biology, and other research areas are fundamentally driven by new technologies. The 1995 publication of methods for microarray gene expression analysis launched a revolution in biological inquiry in which reliable technologies changed the scale at which we were able to generate data from biological systems and started the transformation of biomedical research from a purely laboratory science to an information science. One of the greatest challenges in this information-rich era of biomedical science has been our ability to effectively collect, manage, and analyze the data that even small laboratories now regularly produce. As early adopters of DNA microarray technology, our group combined laboratory inquiry with software development to address biological questions. By applying state-of-the-art statistical and data mining approaches, we created a series of open-source, easy-to-use software tools that provide access to these methods in a manner that allows even those with limited bioinformatics experience to effectively explore their data and reach testable hypotheses regarding the underlying biology. Of these, MeV has become one of the most widely used software tools for the analysis of DNA microarray data, with more than 22,500 downloads in the past calendar year and more than 1,233 total citations, statistics that we believe underestimate the system's overall use. MeV's development and maintenance during the past 10 years have been supported by a series of research grants and other sources. Here we propose to further develop and maintain the codebase, expanding the utility and functionality of the software to meet the challenges of genomic analysis in the coming years.
The greatest challenge in keeping MeV and similar tools relevant is the onslaught of data arising from the new sequencing technologies that are poised to replace array-based analysis in many applications. These tools will open up new avenues of investigation by placing genomic analysis on a footing equal with microarrays, providing an opportunity for new approaches to integrative genomic data analysis. In this application, we describe our plans to expand the utility of MeV by incorporating an ever-increasing number of public domain tools through integration with Cytoscape and Bioconductor, expand the number of novel algorithms developed through our own research and development, extend the software and tools to work with Next Generation sequence-based genomic assays, and provide support to the growing community of users of the existing software and tools. In doing so, we hope to advance understanding of a wide range of diseases, provide resources for interpreting data from such projects as The Cancer Genome Atlas (TCGA) and the Lung Genomics Research Consortium (LGRC), and enable and accelerate research far beyond that of our own.

Public Health Relevance

Project Narrative: MeV is a freely available, open-source software system that provides cutting-edge data analysis tools that have helped research scientists analyze and publish results from genomic studies. With more than 22,500 downloads in the past 12 months, MeV is among the world's most widely used software systems for genomic data analysis. We are requesting funds to continue to support MeV as a resource for the community and to add functionality to the software to enable it to handle the vast quantity of complex data that Next Generation DNA sequencing technologies are beginning to provide.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA151118-01A1
Application #: 8148006
Study Section: Special Emphasis Panel (ZRG1-BST-E (03))
Program Officer: Li, Jerry

Project Start: 2011-09-07
Project End: 2016-08-31
Budget Start: 2011-09-07
Budget End: 2012-08-31
Support Year: 1
Fiscal Year: 2011
Total Cost: $608,383

Institution

Name: Dana-Farber Cancer Institute
DUNS #: 076580745

City: Boston
State: MA
Country: United States
Zip Code: 02215

Related projects


NIH 2015 U01 CA	MeV: Software for Next Generation Genomic Data Analysis Quackenbush, John / Dana-Farber Cancer Institute
NIH 2014 U01 CA	MeV: Software for Next Generation Genomic Data Analysis Quackenbush, John / Dana-Farber Cancer Institute
NIH 2013 U01 CA	MeV: Software for Next Generation Genomic Data Analysis Quackenbush, John / Dana-Farber Cancer Institute	$539,962
NIH 2012 U01 CA	MeV: Software for Next Generation Genomic Data Analysis Quackenbush, John / Dana-Farber Cancer Institute	$606,717
NIH 2011 U01 CA	MeV: Software for Next Generation Genomic Data Analysis Quackenbush, John / Dana-Farber Cancer Institute	$608,383

Publications

Hosny, Ahmed; Parmar, Chintan; Quackenbush, John et al. (2018) Artificial intelligence in radiology. Nat Rev Cancer 18:500-510

Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N et al. (2018) Smooth quantile normalization. Biostatistics 19:185-198

Parmar, Chintan; Barry, Joseph D; Hosny, Ahmed et al. (2018) Data Analysis Strategies in Medical Imaging. Clin Cancer Res 24:3492-3499

Kuijjer, Marieke Lydia; Paulson, Joseph Nathaniel; Salzman, Peter et al. (2018) Cancer subtype identification using somatic mutation data. Br J Cancer 118:1492-1501

Wang, Yaoyu E; Kuznetsov, Lev; Partensky, Antony et al. (2017) WebMeV: A Cloud Platform for Analyzing and Visualizing Cancer Genomic Data. Cancer Res 77:e11-e14

Domenyuk, Valeriy; Zhong, Zhenyu; Stark, Adam et al. (2017) Plasma Exosome Profiling of Cancer Patients by a Next Generation Systems Biology Approach. Sci Rep 7:42741

Kibbe, Warren; Klemm, Juli; Quackenbush, John (2017) Cancer Informatics: New Tools for a Data-Driven Age in Cancer Research. Cancer Res 77:e1-e2

Manimaran, Solaiappan; Selby, Heather Marie; Okrah, Kwame et al. (2016) BatchQC: interactive software for evaluating sample and batch effects in genomic data. Bioinformatics 32:3836-3838

Ferrari, Giovanni; Quackenbush, John; Strobeck, John et al. (2014) Comparative genome-wide transcriptional analysis of human left and right internal mammary arteries. Genomics 104:36-44

Schroder, Markus S; Gusenleitner, Daniel; Quackenbush, John et al. (2013) RamiGO: an R/Bioconductor package providing an AmiGO visualize interface. Bioinformatics 29:666-668

Showing the most recent 10 out of 11 publications
