The most transformative aspect of the Human Genome Project was not the sequence of the genome itself, but the technologies that the project has spawned. Genomics, functional genomics, proteomics, metabolomics, systems biology, and other research areas are fundamentally driven by new technologies. The 1995 publication of methods for microarray gene expression analysis launched a revolution in biological inquiry: reliable technologies changed the scale at which we could generate data from biological systems and began the transformation of biomedical research from a purely laboratory science into an information science. One of the greatest challenges in this information-rich era of biomedical science has been to effectively collect, manage, and analyze the data that even small laboratories now regularly produce. As early adopters of DNA microarray technology, our group combined laboratory inquiry with software development to address biological questions. By applying state-of-the-art statistical and data mining approaches, we created a series of open-source, easy-to-use software tools that give even those with limited bioinformatics experience access to these methods, so that they can effectively explore their data and reach testable hypotheses regarding the underlying biology. Of these, MeV has become one of the most widely used software tools for the analysis of DNA microarray data, with more than 22,500 downloads in the past calendar year and more than 1,233 total citations, statistics that we believe underestimate the system's overall use. MeV's development and maintenance during the past 10 years have been supported by a series of research grants and other sources. Here we propose to further develop and maintain the codebase, expanding the utility and functionality of the software to meet the challenges of genomic analysis in the coming years.
The greatest challenge in keeping MeV and similar tools relevant is the onslaught of data arising from the new sequencing technologies that are poised to replace array-based analysis in many applications. These technologies will open up new avenues of investigation by placing genomic analysis on a footing equal with microarrays, providing an opportunity for new approaches to integrative genomic data analysis. In this application, we describe our plans to expand the utility of MeV by incorporating an ever-increasing number of public domain tools through integration with Cytoscape and Bioconductor, expand the number of novel algorithms developed through our own research, extend the software and tools to work with next-generation sequence-based genomic assays, and provide support to the growing community of users of the existing software and tools. In doing so, we hope to advance understanding of a wide range of diseases, provide resources for interpreting data from such projects as The Cancer Genome Atlas (TCGA) and the Lung Genomics Research Consortium (LGRC), and enable and accelerate research far beyond our own.

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01CA151118-03
Application #: 8518262
Study Section: Special Emphasis Panel (ZRG1-BST-E (03))
Program Officer: Li, Jerry
Project Start: 2011-09-07
Project End: 2016-08-31
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year: 3
Fiscal Year: 2013
Total Cost: $539,962
Indirect Cost: $231,622
Name: Dana-Farber Cancer Institute
Department:
Type:
DUNS #: 076580745
City: Boston
State: MA
Country: United States
Zip Code: 02215

Hosny, Ahmed; Parmar, Chintan; Quackenbush, John et al. (2018) Artificial intelligence in radiology. Nat Rev Cancer 18:500-510
Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N et al. (2018) Smooth quantile normalization. Biostatistics 19:185-198
Parmar, Chintan; Barry, Joseph D; Hosny, Ahmed et al. (2018) Data Analysis Strategies in Medical Imaging. Clin Cancer Res 24:3492-3499
Kuijjer, Marieke Lydia; Paulson, Joseph Nathaniel; Salzman, Peter et al. (2018) Cancer subtype identification using somatic mutation data. Br J Cancer 118:1492-1501
Kibbe, Warren; Klemm, Juli; Quackenbush, John (2017) Cancer Informatics: New Tools for a Data-Driven Age in Cancer Research. Cancer Res 77:e1-e2
Wang, Yaoyu E; Kutnetsov, Lev; Partensky, Antony et al. (2017) WebMeV: A Cloud Platform for Analyzing and Visualizing Cancer Genomic Data. Cancer Res 77:e11-e14
Domenyuk, Valeriy; Zhong, Zhenyu; Stark, Adam et al. (2017) Plasma Exosome Profiling of Cancer Patients by a Next Generation Systems Biology Approach. Sci Rep 7:42741
Manimaran, Solaiappan; Selby, Heather Marie; Okrah, Kwame et al. (2016) BatchQC: interactive software for evaluating sample and batch effects in genomic data. Bioinformatics 32:3836-3838
Ferrari, Giovanni; Quackenbush, John; Strobeck, John et al. (2014) Comparative genome-wide transcriptional analysis of human left and right internal mammary arteries. Genomics 104:36-44
Schroder, Markus S; Gusenleitner, Daniel; Quackenbush, John et al. (2013) RamiGO: an R/Bioconductor package providing an AmiGO visualize interface. Bioinformatics 29:666-668
