9632085 The long-term goal of this project is to determine the complete DNA sequence of the Arabidopsis thaliana genome. For the next three years, the specific goal is to produce at least 8.5 Mb of finished DNA sequence using a two-pronged strategy, based on bacterial artificial chromosome (BAC) clones, for generating megabase quantities of high-quality genome sequence data. The first prong is single-pass sequencing from the ends of BAC clones drawn from a well-constructed genomic library of Arabidopsis. The second is the complete sequencing of BAC clones using a random shotgun approach. Once a BAC has been completely sequenced and assembled, its contig sequence will be compared to the database of BAC end sequences. One or more BACs that have minimal sequence overlap with the completed BAC will then be selected for the next round of shotgun library construction and sequencing. By repeating this sequence/search process after the completion of each BAC, a sequence-based map of the minimal tiling path of Arabidopsis BACs will be generated for the genome without requiring a separate BAC mapping effort. The finished sequence data will be released on TIGR's web site and in GenBank three months from the time work begins on the sequencing of each selected BAC. Clones will be freely available through the Arabidopsis Biological Resource Center at Ohio State. This award is one of three made by the tri-agency (Department of Energy, National Science Foundation, and US Department of Agriculture) Arabidopsis thaliana genome research program. The activity of this project will be coordinated with the other groups engaged in large-scale sequencing of the Arabidopsis genome. The results will contribute to determining the overall strategy for completing the sequence of the entire Arabidopsis genome. More importantly, the information and data produced will be useful to the general research community and will contribute to rapid advances in plant biology.
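
The sequence/search step described above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the function name `pick_next_bac`, the clone identifiers, and the toy sequences are hypothetical, exact substring matching stands in for a real alignment search, and reverse-complement orientation and extension to the left are ignored for brevity. Given a finished BAC sequence and a database of paired BAC end sequences, it selects the clone that extends the finished sequence with the smallest overlap.

```python
from typing import Dict, Optional, Tuple


def pick_next_bac(finished: str, end_db: Dict[str, Tuple[str, str]]) -> Optional[str]:
    """Pick the clone that extends `finished` rightward with minimal overlap.

    `end_db` maps a hypothetical clone ID to its (left_end, right_end) reads.
    Exact substring search is an assumption standing in for sequence alignment.
    """
    best_id: Optional[str] = None
    best_overlap: Optional[int] = None
    for clone_id, (left_end, right_end) in end_db.items():
        pos_left = finished.find(left_end)
        pos_right = finished.find(right_end)
        # A useful clone starts inside the finished sequence and ends outside it.
        if pos_left != -1 and pos_right == -1:
            overlap = len(finished) - pos_left  # bases shared with the finished BAC
            if best_overlap is None or overlap < best_overlap:
                best_id, best_overlap = clone_id, overlap
    return best_id


# Toy data only; real BAC end reads are hundreds of bases long.
finished_bac = "AAAACCCCGGGGTTTTACGT"
bac_ends = {
    "bacA": ("CCCC", "TTGACA"),    # overlaps the last 16 bases
    "bacB": ("TACGT", "GATTACA"),  # overlaps only the last 5 bases
    "bacC": ("AAAA", "GGGG"),      # fully contained, adds nothing new
    "bacD": ("TTGCA", "CATCAT"),   # no match to the finished sequence
}
next_clone = pick_next_bac(finished_bac, bac_ends)
```

Repeating this selection after each newly finished BAC is what yields a minimal tiling path in the sketch: each round adds the clone that contributes the most new sequence.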