This project extends the hidden Markov model methods developed under previous NSF support and applies them to three problems: finding remote homologies between proteins, producing multiple alignments of related protein domains, and classifying protein domains into families and subfamilies based on common structure, function, and phylogenetic relationship. New methods that combine mRNA expression data with sequence similarity to perform functional analysis and discover pathways of interaction between proteins are also being developed. More sophisticated models of protein structure will provide better recognition of active sites and better detection of remote homologies. By developing more powerful and comprehensive computational methods for protein classification and analysis, and making them available on the World Wide Web, this work will contribute to the basic study of a wide variety of fundamental protein families and may also lead to the discovery of important new protein families.