The advent of genomics has enabled a wide range of technologies, including expression profiling, proteomics and metabolomics. These technologies allow us to use a holistic approach to analyzing biological systems so as to understand how they respond to their environments, the developmental program, and perturbations that result in the development of disease states. The improving quality and robustness of experimental assays using 'omic technologies, combined with falling costs, have resulted in their increasingly widespread use. While producing the large datasets that these technologies provide has become easier, for many, the challenge remains one of collecting, managing, and analyzing the data resulting from large-scale studies. Although commercial software solutions exist, the vast majority are "closed box" systems that provide little flexibility and little adaptability at a high cost. The need to address issues of data analysis in an open and extensible way was recognized in the NIH Roadmap for Bioinformatics and Computational Biology as one of the issues key to the success of future medical research. Within this broader context, our group has been working to create software tools for the analysis of genomic-scale expression data. The TM4 software system was developed as a freely distributed, open-source software system for the analysis of microarray data. With more than 10,000 registered users and, by a conservative estimate, four to six times that many users in total, TM4 is one of the most widely used software systems for the analysis of microarray data. Developed to support a range of ongoing expression analysis projects, TM4 has benefited from close collaboration between laboratory scientists and software developers to put advanced algorithms for the analysis of high-dimensional datasets into a form that can be widely used.
Our goal in this proposal is to continue to refine and expand these tools, incorporating improvements and additions made by our group and others, and to continue to improve and maintain the code, providing users with regular, stable, well-documented updates.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1-HS-E (M3))
Program Officer: Ye, Jane
Dana-Farber Cancer Institute
United States
Skiadas, Christine C; Duan, Shenghua; Correll, Mick et al. (2012) Ovarian reserve status in young women is associated with altered gene expression in membrana granulosa cells. Mol Hum Reprod 18:362-71
Howe, Eleanor A; Sinha, Raktim; Schlauch, Daniel et al. (2011) RNA-Seq analysis in MeV. Bioinformatics 27:3209-10
Liu, Fenglong; White, Joseph A; Antonescu, Corina et al. (2011) GCOD - GeneChip Oncology Database. BMC Bioinformatics 12:46
Chervitz, Stephen A; Deutsch, Eric W; Field, Dawn et al. (2011) Data standards for Omics data: the basis of data sharing and reuse. Methods Mol Biol 719:31-69
Dutta, Bhaskar; Kanani, Harin; Quackenbush, John et al. (2009) Time-series integrated "omic" analyses to elucidate short-term stress-induced responses in plant liquid cultures. Biotechnol Bioeng 102:264-279
Colak, Dilek; Kaya, Namik; Al-Zahrani, Jawaher et al. (2009) Left ventricular global transcriptional profiling in human end-stage dilated cardiomyopathy. Genomics 94:20-31
Bateman, Alex; Quackenbush, John (2009) Bioinformatics for next generation sequencing. Bioinformatics 25:429
Chittenden, Thomas W; Howe, Eleanor A; Culhane, Aedin C et al. (2008) Functional classification analysis of somatically mutated genes in human breast and colorectal cancers. Genomics 91:508-11