In recent years there have been exciting breakthroughs in the application of computational methods to problems in cancer genomics. Machine learning techniques applied to gene expression data have been used to address the questions of distinguishing tumor morphology, predicting post-treatment outcome, and finding molecular markers for disease. While these studies have been very promising, significant challenges remain. Extracting biological knowledge from microarray-based gene expression data is difficult. The development of robust and accurate expression-based classifiers of biological and clinical states is similarly problematic. Biologists do not have access to an integrated set of robust, sophisticated analytical tools. Our goal is to develop, implement, and distribute computational genomics methods that address these challenges in the gene expression profiling field.
Aim 1: Capture the behavior of a set of genes representing a pathway or state of the cell, reducing a list of thousands of expressed genes to a few hundred metagenes. Metagenes should filter out the noise, technical variation, and idiosyncrasies of the data while capturing its underlying molecular logic and biologically relevant correlations and structure.
Aim 2: Develop a robust and validated computational methodology for using metagene markers for classification.
Aim 3: Develop and distribute an integrated software package, GenePattern, to put the power of sophisticated computational methods into the hands of the biomedical research community.
Our extensive experience developing methods, analyzing patient sample data, and creating and distributing software tools for this area of research makes us well suited to carry out this program.
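To make Aims 1 and 2 concrete, the following is a minimal sketch of one way metagenes could be extracted and then used for classification. The abstract does not name a specific algorithm, so everything here is an assumption for illustration: non-negative matrix factorization (NMF) with multiplicative updates stands in for the metagene step, a nearest-centroid rule stands in for the classifier, and the function names and synthetic data are hypothetical rather than part of GenePattern.

```python
import numpy as np


def nmf_metagenes(A, k, n_iter=300, seed=0, eps=1e-9):
    """Factor a non-negative genes x samples matrix as A ~ W @ H.

    W (genes x k) defines the metagenes; H (k x samples) gives each
    sample's metagene expression. Uses Lee-Seung multiplicative updates
    for the Frobenius-norm objective (an illustrative choice only).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k)) + eps
    H = rng.random((k, A.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project(W, A):
    """Map samples (genes x samples) into metagene space by least squares."""
    return np.linalg.pinv(W) @ A


def fit_centroids(H, labels):
    """Compute one mean metagene profile per class."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.stack([H[:, labels == c].mean(axis=1) for c in classes])
    return classes, centroids


def predict(classes, centroids, H_new):
    """Assign each sample (column of H_new) to the nearest class centroid."""
    diff = centroids[:, :, None] - H_new[None, :, :]  # classes x k x samples
    dist = np.linalg.norm(diff, axis=1)               # classes x samples
    return classes[dist.argmin(axis=0)]


def make_toy_data(n_genes=200, n_per_class=20, seed=1):
    """Synthetic data (hypothetical): two hidden programs, two classes."""
    rng = np.random.default_rng(seed)
    W_true = rng.random((n_genes, 2))
    labels = np.repeat([0, 1], n_per_class)
    H_true = rng.random((2, 2 * n_per_class))
    H_true[0, labels == 0] += 6.0  # class 0 over-expresses program 0
    H_true[1, labels == 1] += 6.0  # class 1 over-expresses program 1
    A = W_true @ H_true + 0.05 * rng.random((n_genes, 2 * n_per_class))
    return A, labels


def run_demo():
    """Learn metagenes on half the samples, classify the other half."""
    A, labels = make_toy_data()
    train = np.arange(A.shape[1]) % 2 == 0  # even columns train, odd test
    W, _ = nmf_metagenes(A[:, train], k=2)
    classes, centroids = fit_centroids(project(W, A[:, train]), labels[train])
    pred = predict(classes, centroids, project(W, A[:, ~train]))
    return float((pred == labels[~train]).mean())


if __name__ == "__main__":
    print("held-out accuracy on toy data:", run_demo())
```

In this sketch the metagenes are learned from the training samples only, and both training and held-out samples are projected through the same map before classification, so the held-out samples never influence the metagene definitions. A real analysis would need normalization, a principled choice of the number of metagenes, and proper cross-validation, none of which are shown here.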

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Exploratory/Developmental Grants Phase II (R33)
Project #
5R33CA097556-04
Application #
6756446
Study Section
Genome Study Section (GNM)
Program Officer
Couch, Jennifer A
Project Start
2002-07-01
Project End
2006-06-30
Budget Start
2004-07-01
Budget End
2006-06-30
Support Year
4
Fiscal Year
2004
Total Cost
$535,007
Indirect Cost
Name
Massachusetts Institute of Technology
Department
Type
Organized Research Units
DUNS #
001425594
City
Cambridge
State
MA
Country
United States
Zip Code
02139
Cho, Yoon-Jae; Tsherniak, Aviad; Tamayo, Pablo et al. (2011) Integrative genomic analysis of medulloblastoma identifies a molecular subgroup that drives poor clinical outcome. J Clin Oncol 29:1424-30