The most transformative aspect of the Human Genome Project was not the sequence of the genome itself, but the technologies that the project has spawned. Genomics, functional genomics, proteomics, metabolomics, systems biology, and other research areas are fundamentally driven by new technologies. The 1995 publication of methods for microarray gene expression analysis launched a revolution in biological inquiry: reliable technologies changed the scale at which we were able to generate data from biological systems and began the transformation of biomedical research from a purely laboratory science to an information science. One of the greatest challenges in this information-rich era of biomedical science has been effectively collecting, managing, and analyzing the data that even small laboratories now regularly produce. As early adopters of DNA microarray technology, our group combined laboratory inquiry with software development to address biological questions. By applying state-of-the-art statistical and data mining approaches, we created a series of open-source, easy-to-use software tools that make these methods accessible, allowing even those with limited bioinformatics experience to effectively explore their data and reach testable hypotheses regarding the underlying biology. Of these, MeV has become one of the most widely used software tools for the analysis of DNA microarray data, with more than 22,500 downloads in the past calendar year and more than 1,233 total citations, statistics we believe underestimate the system's overall use. MeV's development and maintenance during the past 10 years have been supported by a series of research grants and other sources. Here we propose to further develop and maintain the codebase, expanding the utility and functionality of the software to meet the challenges of genomic analysis in the coming years.
The greatest challenge in keeping MeV and similar tools relevant is the onslaught of data arising from the new sequencing technologies that are poised to replace array-based analysis in many applications. These technologies will open up new avenues of investigation by placing genomic analysis on an equal footing with microarrays, providing an opportunity for new approaches to integrative genomic data analysis. In this application, we describe our plans to expand the utility of MeV by incorporating an ever-increasing number of public domain tools through integration with Cytoscape and Bioconductor, to expand the number of novel algorithms developed through our own work, to extend the software and tools to work with next-generation sequence-based genomic assays, and to provide support to the growing community of users of the existing software and tools. In doing so, we hope to advance understanding of a wide range of diseases, provide resources for interpreting data from such projects as The Cancer Genome Atlas (TCGA) and the Lung Genomics Research Consortium (LGRC), and enable and accelerate research far beyond our own.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZRG1-BST-E (03))
Program Officer
Li, Jerry
Dana-Farber Cancer Institute
United States
Ferrari, Giovanni; Quackenbush, John; Strobeck, John et al. (2014) Comparative genome-wide transcriptional analysis of human left and right internal mammary arteries. Genomics 104:36-44
Schroder, Markus S; Gusenleitner, Daniel; Quackenbush, John et al. (2013) RamiGO: an R/Bioconductor package providing an AmiGO visualize interface. Bioinformatics 29:666-8
Milbury, Coren A; Correll, Mick; Quackenbush, John et al. (2012) COLD-PCR enrichment of rare cancer mutations prior to targeted amplicon resequencing. Clin Chem 58:580-9