The most transformative aspect of the Human Genome Project was not the sequence of the genome itself, but the technologies that the project has spawned. Genomics, functional genomics, proteomics, metabolomics, systems biology, and other research areas are fundamentally driven by new technologies. The 1995 publication of methods for microarray gene expression analysis launched a revolution in biological inquiry: reliable technologies changed the scale at which we could generate data from biological systems and began the transformation of biomedical research from a purely laboratory science into an information science. One of the greatest challenges in this information-rich era of biomedical science has been to effectively collect, manage, and analyze the data that even small laboratories now regularly produce. As early adopters of DNA microarray technology, our group combined laboratory inquiry with software development to address biological questions. By applying state-of-the-art statistical and data mining approaches, we created a series of open-source, easy-to-use software tools that give even those with limited bioinformatics experience access to these methods, so that they can effectively explore their data and reach testable hypotheses regarding the underlying biology. Of these, MeV has become one of the most widely used software tools for the analysis of DNA microarray data, with more than 22,500 downloads in the past calendar year and more than 1,233 total citations, statistics we believe underestimate the system's overall use. MeV's development and maintenance during the past 10 years have been supported by a series of research grants and other sources. Here we propose to further develop and maintain the codebase, expanding the utility and functionality of the software to meet the challenges of genomic analysis in the coming years.
The greatest challenge in keeping MeV and similar tools relevant is the onslaught of data arising from the new sequencing technologies that are poised to replace array-based analysis in many applications. These technologies will open up new avenues of investigation by placing genomic analysis on a footing equal with microarrays, providing an opportunity for new approaches to integrative genomic data analysis. In this application, we describe our plans to expand the utility of MeV by incorporating an ever-increasing number of public domain tools through integration with Cytoscape and Bioconductor, expand the number of novel algorithms developed through our own research and development, extend the software and tools to work with Next Generation sequence-based genomic assays, and support the growing community of users of the existing software and tools. In doing so, we hope to advance understanding of a wide range of diseases, provide resources for interpreting data from such projects as The Cancer Genome Atlas (TCGA) and the Lung Genomics Research Consortium (LGRC), and enable and accelerate research far beyond our own.

Public Health Relevance

Project Narrative: MeV is a freely available, open-source software system that provides cutting-edge data analysis tools that have helped research scientists analyze and publish results from genomic studies. With more than 22,500 downloads in the past 12 months, MeV is among the world's most widely used software systems for genomic data analysis. We are requesting funds to continue to support MeV as a resource for the community and to add functionality to the software to enable it to handle the vast quantity of complex data that Next Generation DNA sequencing technologies are beginning to provide.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01CA151118-01A1
Application #: 8148006
Study Section: Special Emphasis Panel (ZRG1-BST-E (03))
Program Officer: Li, Jerry
Project Start: 2011-09-07
Project End: 2016-08-31
Budget Start: 2011-09-07
Budget End: 2012-08-31
Support Year: 1
Fiscal Year: 2011
Total Cost: $608,383
Indirect Cost:
Name: Dana-Farber Cancer Institute
Department:
Type:
DUNS #: 076580745
City: Boston
State: MA
Country: United States
Zip Code: 02215
Hosny, Ahmed; Parmar, Chintan; Quackenbush, John et al. (2018) Artificial intelligence in radiology. Nat Rev Cancer 18:500-510
Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N et al. (2018) Smooth quantile normalization. Biostatistics 19:185-198
Parmar, Chintan; Barry, Joseph D; Hosny, Ahmed et al. (2018) Data Analysis Strategies in Medical Imaging. Clin Cancer Res 24:3492-3499
Kuijjer, Marieke Lydia; Paulson, Joseph Nathaniel; Salzman, Peter et al. (2018) Cancer subtype identification using somatic mutation data. Br J Cancer 118:1492-1501
Wang, Yaoyu E; Kutnetsov, Lev; Partensky, Antony et al. (2017) WebMeV: A Cloud Platform for Analyzing and Visualizing Cancer Genomic Data. Cancer Res 77:e11-e14
Domenyuk, Valeriy; Zhong, Zhenyu; Stark, Adam et al. (2017) Plasma Exosome Profiling of Cancer Patients by a Next Generation Systems Biology Approach. Sci Rep 7:42741
Kibbe, Warren; Klemm, Juli; Quackenbush, John (2017) Cancer Informatics: New Tools for a Data-Driven Age in Cancer Research. Cancer Res 77:e1-e2
Manimaran, Solaiappan; Selby, Heather Marie; Okrah, Kwame et al. (2016) BatchQC: interactive software for evaluating sample and batch effects in genomic data. Bioinformatics 32:3836-3838
Ferrari, Giovanni; Quackenbush, John; Strobeck, John et al. (2014) Comparative genome-wide transcriptional analysis of human left and right internal mammary arteries. Genomics 104:36-44
Schroder, Markus S; Gusenleitner, Daniel; Quackenbush, John et al. (2013) RamiGO: an R/Bioconductor package providing an AmiGO visualize interface. Bioinformatics 29:666-8
