Novel and widely applicable methods are described to (i) isolate promoter-proximal sequences and 5' exons from genes expressed in cultured cells, (ii) isolate 5' sequences from regions of individual human chromosomes, and (iii) clone full-length cDNAs using 5' exon sequences. Promoter trap retrovirus shuttle vectors will be used to isolate a large number of cell clones in which the virus has inserted in or near 5' exons of expressed genes. Libraries of expressed flanking sequences will be cloned by plasmid rescue and analyzed to determine the number of transcriptionally active sites that can activate virus gene expression and the extent to which expressed cellular genes can be recovered in rescued libraries. Methods for subtracting DNA libraries will be developed to isolate promoter regions and 5' exons from genes expressed in one cell type but not another. These methods will be used to isolate 5' regions of genes expressed on human chromosome 4 from somatic cell hybrids. Promoter-proximal sequences from genes expressed on human chromosome 4 will also be isolated by a modified Alu-PCR protocol. Finally, experiments will explore the possibility of using expressed flanking sequences to enzymatically amplify full-length cDNAs derived from genes disrupted as a result of promoter trap selection. In principle, expressed flanking sequence libraries can be used to locate 5' promoter regions of genes present in cosmid or YAC clones and will facilitate the isolation of full-length cDNA clones. Automated sequencing of the flanking sequences will provide functional landmarks in an emerging genome sequence and yield promoter-tagged sites (PTSs), analogous to sequence-tagged sites (STSs), for use in mapping studies.