DasSarma 9722066 To better understand the genetic basis for gas vesicle synthesis and the mechanism of genetic instability in the halophilic Archaeum Halobacterium halobium NRC-1, plasmid pNRC100 is being sequenced in its entirety. Plasmid pNRC100 is 200 kb in size and exists as two inversion isomers, related by recombination between two copies of a 35 kb inverted repeat. It contains a cluster of genes specifying gas vesicle synthesis and many transposable IS elements, which mediate DNA rearrangements. The complete sequence of pNRC100 will reveal all of the gas vesicle genes and transposable elements on this plasmid and provide a deeper understanding of the structure of this novel replicon. Moreover, the plasmid sequence will shed light for the first time on the organization of a large replicon in a halophilic Archaeum, including the distribution of genes, IS elements, and left-handed Z-DNA regions. The plasmid also represents a significant fraction of the H. halobium genome, approximately 8%, and its successful sequencing would demonstrate the feasibility of sequencing the complete H. halobium genome. For sequencing of pNRC100, we have already constructed a shotgun library in M13mp18. We plan to sequence 3,000 members of this library (about 0.5 to 1.0 kb from each clone) to obtain 6- to 8-fold coverage. After shotgun sequencing, we plan to use both GCG and Phred assemblers running on UNIX workstations for sequence assembly. We recently end-sequenced all 13 HindIII fragments of pNRC100, which we had cloned previously, to serve as a scaffold for the assembly. After shotgun sequencing is complete, any remaining gaps will be filled by sequencing the complementary strands of M13 clones located at the ends of contigs and by sequencing bridge clones generated by PCR amplification.
Ambiguities resulting from band compression in GC-rich regions will be resolved, as necessary, by radioisotope-based sequencing using modified nucleotides, different polymerases, and highly denaturing gels. The complete pNRC100 sequence will be analyzed using a battery of sequence analysis programs in the GCG and other software packages and the NCBI homology search server. Genes on pNRC100 with homologs elsewhere will be subjected to phylogenetic analysis. Halobacteria have a very well-developed genetic system, so analysis of sequenced genes can be undertaken much more easily than with other Archaea that grow in "extreme" conditions. Proteins from halobacteria are unusual in their ability to function in high salt or even in organic solvents. A better understanding of genes from halobacteria is likely to have applications in biotechnology.
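
The shotgun plan above (3,000 clones, about 0.5 to 1.0 kb read from each, against a 200 kb plasmid) can be sanity-checked with a simple coverage estimate. The sketch below is illustrative only and is not part of the proposed work: it assumes every clone yields a usable read, that reads fall uniformly at random on the plasmid, and that any overlap between reads is detectable (a simplified Lander-Waterman model). The function name and its parameters are hypothetical.

```python
import math


def shotgun_stats(n_reads, read_len_bp, target_bp):
    """Idealized shotgun statistics (simplified Lander-Waterman model).

    Assumes uniformly random read placement, a 100% read success rate,
    and that every overlap is detected; real projects fall short of this.
    """
    # Average fold coverage: total bases read divided by target size.
    coverage = n_reads * read_len_bp / target_bp
    # Poisson probability that a given base is covered by no read.
    frac_uncovered = math.exp(-coverage)
    # Expected number of contigs, and hence roughly of gaps to close.
    expected_contigs = n_reads * math.exp(-coverage)
    return coverage, frac_uncovered, expected_contigs


if __name__ == "__main__":
    # Figures from the text: 3,000 clones, 200 kb plasmid,
    # read lengths at the two ends of the stated 0.5 to 1.0 kb range.
    for read_len in (500, 1000):
        cov, unc, contigs = shotgun_stats(3000, read_len, 200_000)
        print(f"read length {read_len} bp: {cov:.1f}-fold coverage, "
              f"{unc:.2e} of bases uncovered, ~{contigs:.2f} contigs expected")
```

Under these idealized assumptions, 0.5 kb reads give 7.5-fold coverage, which falls within the 6- to 8-fold range stated above; longer reads would give more, and failed or short reads would give less. Inverted repeats and IS elements are not modeled here, which is one reason a restriction-fragment scaffold and directed gap closure remain useful even at high nominal coverage.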