This project is a continuing study of which similarities can be expected to occur purely by chance when two protein or DNA sequences are compared. A related, subsidiary question concerns the definition of scoring systems that are optimal for distinguishing biologically meaningful patterns from chance similarities. Advances this year include: a) the definition of improved scoring systems for DNA sequence comparison; b) an analysis of when protein comparison is more sensitive than DNA comparison to biological relationships; c) the definition of a scoring system for macromolecular sequence comparison that is sensitive to similarities at all evolutionary distances, and an analysis of its statistics; d) the development of statistics for the sum of the scores of high-scoring segment pairs; e) the development of Poisson statistics for consistent high-scoring segment pairs; f) the estimation of optimal gap costs for pairwise sequence comparison.
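To make the notion of chance similarity concrete, the following is a minimal illustrative sketch and not the project's own code. It assumes the widely used Karlin-Altschul form for ungapped local alignments, in which the scale parameter lambda is the positive root of sum_ij p_i p_j exp(lambda * s_ij) = 1, the expected number of high-scoring segment pairs with score at least S is K * m * n * exp(-lambda * S), and the number of such pairs is approximately Poisson distributed. The function names, the uniform base frequencies, the +1/-1 scoring system, and the value of K are all assumptions chosen for illustration; K in particular is a placeholder, not a computed constant.

```python
import math
from itertools import product


def solve_lambda(freqs, score, tol=1e-12):
    """Positive root of sum_ij p_i p_j exp(lambda * s_ij) = 1, found by bisection.

    freqs maps each letter to its background probability; score(a, b) returns
    the substitution score. Both are illustrative inputs.
    """
    letters = list(freqs)
    pairs = [(freqs[a] * freqs[b], score(a, b))
             for a, b in product(letters, repeat=2)]
    # A positive root exists only if the expected score is negative
    # and at least one attainable score is positive.
    if sum(p * s for p, s in pairs) >= 0:
        raise ValueError("expected score must be negative")
    if max(s for p, s in pairs if p > 0) <= 0:
        raise ValueError("at least one score must be positive")

    def f(lam):
        return sum(p * math.exp(lam * s) for p, s in pairs) - 1.0

    # f(0) = 0 and f is convex with negative slope at 0, so f is negative
    # between 0 and the root and positive beyond it.
    lo, hi = 0.0, 1.0
    while f(hi) <= 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def expected_hsps(s, m, n, lam, K):
    """Expected number of chance segment pairs scoring at least s
    when sequences of lengths m and n are compared."""
    return K * m * n * math.exp(-lam * s)


def p_at_least_one(e):
    """Poisson probability of one or more chance hits, given expectation e."""
    return -math.expm1(-e)


# Illustrative DNA example: uniform base frequencies, match +1, mismatch -1.
dna = {base: 0.25 for base in "ACGT"}
lam = solve_lambda(dna, lambda a, b: 1 if a == b else -1)
# For this scoring system the root can be checked by hand: lambda = ln 3.

K_PLACEHOLDER = 0.1  # hypothetical value, for illustration only
e = expected_hsps(20, 1000, 1000, lam, K_PLACEHOLDER)
p = p_at_least_one(e)
```

Under these assumptions, a scoring system with uniform base frequencies and scores of +1 and -1 has lambda = ln 3, which corresponds to target alignments with 75% identity. Different match and mismatch scores imply different target frequencies, which is one way to see why the choice of DNA scoring system matters.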