The information content of the human genome is expanded by the covalent methylation of cytosine residues, which introduces approximately 3 x 10^7 residues of 5-methylcytosine per haploid genome. Methylation triggers assembly of affected sequences into a hypoacetylated and condensed state that inhibits transcription and recombination. Abnormalities of genomic methylation patterns are involved in carcinogenesis and at least two fatal genetic disorders, and disruption of methylation patterns in mice is lethal and is associated with abnormalities of genome stability, genomic imprinting, and X inactivation. A full understanding of the function and organization of the human genome will therefore require an understanding of the superimposed methylation patterns. However, the large-scale organization of genomic methylation patterns is very poorly understood. This has led to uncertainty and controversy as to the biological functions of cytosine methylation. We propose to map the methylation landscape of the human genome by a post-genomic approach that involves the application of new, simple, and robust methods for the selective extraction of methylated and unmethylated sequences. Unmethylated genomic libraries will be constructed by selective removal of methylated sequences by McrBC nuclease from E. coli, and methylated libraries will be made by degradation of unmethylated sequences by multiple methylation-sensitive restriction endonucleases. We will first map all heavily methylated regions and all unmethylated regions within chromosome 21, and upon validation of the method will extend analysis to the rest of the genome. These data will be analyzed online as they emerge from automated high-throughput sequencers and added as annotation to the human genome browser developed by Drs. Haussler and Grundy and their colleagues. Sequencing of McrBC libraries will provide full coverage of all CpG island sequences, which mark the 5' ends of most genes.
These data will be of great importance in the objective definition of 5' exons and start sites, which have been difficult to identify by purely computational means. We have also developed a simple subtractive hybridization method for the isolation of sequences that are differentially methylated between tissues or developmental stages, or between normal and cancerous tissues. In the latter case, the method provides a genome-wide scan for known tumor suppressor genes that might have been silenced by methylation; new candidate tumor suppressor genes will also be identified. The method will also isolate both known and novel imprinted genes. The impending completion of the sequence of the human genome presents a unique opportunity to gain understanding of the shape of genomic methylation patterns and their role in human development and disease.