Molecular sequence databases contain approximately 5,000 independent families of protein sequences. A small number of these span multiple phyla and must represent ancient, evolutionarily conserved families of proteins. For well-studied phyla, most of these ancient families now appear to be represented in the molecular sequence databases. Proposed course: An algorithm, HHS, has been developed to take the pairwise similarity relations generated by the program BLASTP and assemble them into classes of mutually related proteins. The algorithm works in two phases. In the first phase, the ungapped high-scoring segments identified by BLAST are assembled into sets of mutually consistent diagonals, forming a gapped sequence alignment. In the second phase, the extents of these gapped alignments to each protein are compared. Overlapping alignments indicate the presence of a protein sequence domain. A connected-set definition is employed to map out each family of protein domains. The algorithm is computationally efficient and has been used to classify BLAST searches run between all pairs of sequences in the NCBI non-redundant sequence database. Future work will implement a name generator for these protein domains so that they can be used as an automated source of protein annotation for molecular sequences. The evolution of individual domains is also being explored.
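
The second phase can be illustrated with a minimal Python sketch. It is not the HHS implementation, only one plausible reading of the connected-set definition: each gapped alignment contributes one segment on each of its two proteins, the two segments of an alignment are linked, segments that overlap on the same protein are linked, and each connected component is taken as one domain family. The names `Alignment`, `DisjointSet`, `domain_families`, and the `min_overlap` threshold are hypothetical and do not come from the original work.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """One gapped alignment: residue extents on protein a and protein b (hypothetical record)."""
    a: str
    a_start: int
    a_end: int
    b: str
    b_start: int
    b_end: int


class DisjointSet:
    """Union-find over hashable items, with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[ry] = rx


def domain_families(alignments, min_overlap=1):
    """Group aligned segments into families by connected components.

    min_overlap (in residues) is an assumed threshold, not a value from the source.
    Returns a sorted list of sorted protein-name lists, one per family.
    """
    ds = DisjointSet()
    by_protein = defaultdict(list)
    for aln in alignments:
        seg_a = (aln.a, aln.a_start, aln.a_end)
        seg_b = (aln.b, aln.b_start, aln.b_end)
        ds.union(seg_a, seg_b)  # the two ends of one alignment are related
        by_protein[aln.a].append(seg_a)
        by_protein[aln.b].append(seg_b)

    # Link segments that overlap on the same protein: they cover the same domain.
    for segs in by_protein.values():
        segs.sort(key=lambda s: s[1])
        for i, s in enumerate(segs):
            for t in segs[i + 1:]:
                if t[1] > s[2]:
                    break  # sorted by start, so no later segment can overlap s
                if min(s[2], t[2]) - max(s[1], t[1]) + 1 >= min_overlap:
                    ds.union(s, t)

    families = defaultdict(set)
    for seg in list(ds.parent):
        families[ds.find(seg)].add(seg[0])
    return sorted(sorted(members) for members in families.values())


# Toy input: P3 carries two non-overlapping domains, so it appears in two families.
example = [
    Alignment("P1", 1, 100, "P2", 5, 110),
    Alignment("P2", 10, 105, "P3", 1, 95),
    Alignment("P3", 200, 300, "P4", 1, 100),
]
print(domain_families(example))
```

In the toy input, the two alignments touching P2 overlap on P2 and are merged into one family, while the two segments on P3 do not overlap and therefore stay in separate families. Linking through any single overlap is a single-linkage choice made for this sketch; a stricter overlap criterion can be approximated by raising `min_overlap`.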