It is now widely accepted that cancer is a genetic disease, mediated in large part by alterations in specific genes. However, many of the key tumor suppressor genes and oncogenes responsible for cancer initiation and progression remain to be identified. Although technical hurdles have in the past limited our ability to identify such genes in a comprehensive fashion, the delineation of the sequence of the human genome, coupled with recent advances in high-throughput DNA analysis technologies, has created an unprecedented opportunity for major progress in this field. Perlegen Sciences has developed a sophisticated high-throughput re-sequencing platform that uses high-density oligonucleotide microarrays (HDOMAs) to enable analysis of millions of base pairs at single-base resolution at a fraction of the cost of conventional sequencing technology. To identify and characterize novel cancer genes, we propose to use this platform to re-sequence all coding sequences of all known genes in a panel of colorectal carcinomas (CRCs). To demonstrate the feasibility and utility of this approach, we propose in this Phase I application to analyze at single-base resolution all coding exons from chromosomes 1p, 6, 15, and 18 (approximately 16% of the genome) in each of 12 CRC genomes. From our experience using HDOMAs to re-sequence the short arm of chromosome 8 in a panel of CRC tumors, we anticipate that mutational analysis of all known genes on chromosomes 1p, 6, 15, and 18 in 12 tumor genomes will result in the identification of approximately 30,000 DNA variants, consisting principally of germline SNPs. To discriminate tumor-specific (somatic) mutations from germline variants, we will use Perlegen's polymorphism database to filter out known SNPs, and then use custom genotyping HDOMAs to evaluate the approximately 4,000 candidate somatic mutations expected to remain, both in the tumor DNAs and in normal DNA from the same patients.
We expect that this effort will result in the identification of approximately 100 bona fide somatic mutations, including non-synonymous mutations affecting approximately as many genes. To assess the involvement of these affected genes in tumorigenesis more generally, and to characterize more thoroughly the spectrum of mutations that occur in them, we will use conventional automated sequencing technology to evaluate the affected genes for additional tumor-specific alterations in an expanded panel of 50 tumor/normal sample pairs. We anticipate that these results will demonstrate the power of our approach and will support a Phase II effort to analyze coding sequences across the genome. Over 130,000 individuals in the US are diagnosed with CRC each year; 55,000 of these individuals will die from the disease. Our ultimate goal is the comprehensive identification of genetic alterations in CRC. As our recent mutational analyses of the tyrosine kinase gene family have shown, such comprehensive genetic analyses of cancer are likely to be extremely productive, revealing a large number of mutated genes not previously implicated in tumorigenesis. Such results will ultimately lead to greater understanding of cancer etiology, improved tools for cancer detection and diagnosis, new targets for therapeutic and preventative intervention, and opportunities for individualized treatment. To maximize the impact of our efforts, all data for all somatic mutations identified in this study will be deposited in the Human Gene Mutation Database at the IMG in Cardiff and in the Cancer Genome Anatomy Project at NCI.
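
The two-stage discrimination of somatic from germline variants described above (removal of known SNPs, followed by comparison of tumor DNA with matched normal DNA) can be illustrated with a minimal sketch. This is not the platform's software: the `Variant` record, the function names, and the representation of genotypes as sets of observed alleles are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for a single-base variant (illustrative only)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_known_snps(variants, known_snps):
    """Stage 1: drop variants already listed in a germline polymorphism set."""
    return [v for v in variants if v not in known_snps]


def confirm_somatic(candidates, tumor_alleles, normal_alleles):
    """Stage 2: keep candidates whose variant allele is observed in the
    tumor DNA but absent from the matched normal DNA of the same patient.

    Both allele arguments map a Variant to the set of alleles observed
    at that position (an assumed, simplified genotype representation).
    """
    return [
        v for v in candidates
        if v.alt in tumor_alleles.get(v, set())
        and v.alt not in normal_alleles.get(v, set())
    ]


# Toy data: one known SNP, one true somatic change, one unlisted germline SNP.
known = Variant("18", 100, "C", "T")
somatic = Variant("18", 200, "G", "A")
germline = Variant("6", 300, "A", "G")

candidates = filter_known_snps([known, somatic, germline], {known})
tumor = {somatic: {"G", "A"}, germline: {"A", "G"}}
normal = {somatic: {"G"}, germline: {"A", "G"}}
confirmed = confirm_somatic(candidates, tumor, normal)
```

In this toy example the database filter removes the known SNP, and the tumor/normal comparison then removes the germline variant that the database did not list, leaving only the tumor-specific change.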