Identification of protein families and their sequence motifs is increasingly important as sequence information from the Human Genome Project becomes available. In earlier work supported by this award, we developed a technique for calculating sequence-based motif descriptions, or evolutionary profiles (EP). Briefly, this method fits an explicit evolutionary model to each position in a group of aligned sequences. The model determines the best evolutionary distance for each of the twenty amino acid residues. A finite mixture model is then calculated in which each of the twenty possible ancestral residues is weighted by its probability of giving rise to the observed distribution at the given evolutionary distance. In continuing work this year, we are using the EP method to construct models for the entire family of serine/threonine and tyrosine kinases. This project involves the construction of multiple alignments for each of the five major kinase families and 55 subfamilies, and the subsequent generation of EP. EP are effective descriptors and classifiers for protein families, and the kinase profiles will be used to classify novel kinases and to provide multiple sequence alignments for the Protein Kinase Resource at SDSC. This "testbed" use of the software is yielding continual improvements to the profile analysis methods. Based on experience from the last year, we have implemented a new and more flexible server for pattern recognition. This server, SeqWeb, is designed to support multiple applications in a variety of modes of interaction. Providing end-user applications via the Internet poses a variety of difficulties. While some applications complete quickly, and are thus good candidates for an interactive web page, others require minutes or hours to finish and must therefore employ a different mode of interaction.
Furthermore, many applications require inputs from other programs, or provide inputs to other programs. For a traditional interactive server, this requires frequent copying of data to and from the server, and each transfer is a chance to introduce errors. The SeqWeb server has been designed to address many of these issues. Temporary file storage is provided to facilitate the use of a series of programs. Local file storage also allows the system to control file formats more closely, and obviates many tedious format conversions and the errors introduced by file transfers. It also provides a drop-off and pick-up mode of interaction for analyses that require more than a few minutes. In addition, SeqWeb returns results by web page (the traditional mode) and by email (similar to the MEME/MAST server). These three modes of interaction greatly enhance flexibility in providing access to applications. The current version of SeqWeb implements a series of programs for use with the evolutionary profile method described above: specifically, multiple sequence alignment, sequence weighting, profile creation, profile database searching (using Bioccelerator), and alignment of sequences and profiles. SeqWeb also provides access to databases available locally, and tools for the custom display of alignments. In the next year, SeqWeb will be extended to include the MEME and MAST programs developed in collaboration with NBCR, and its functionality will be extended by the addition of other display and analysis functions.
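The evolutionary profile calculation summarized earlier can be illustrated for a single alignment column. The following is a minimal sketch, not the project's implementation: it assumes a toy equal-rates substitution model with a uniform background and a uniform prior over ancestors, and it finds each best distance by a simple grid search. The function names, the grid, and the model are all hypothetical choices made for illustration; the actual method would use an empirical amino acid substitution model.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies (also used as the ancestor prior).
BACKGROUND = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}


def transition_prob(ancestor, residue, distance):
    """P(residue | ancestor, distance) under a toy equal-rates model (assumption)."""
    stay = math.exp(-distance)
    p = (1.0 - stay) * BACKGROUND[residue]
    if residue == ancestor:
        p += stay
    return p


def log_likelihood(counts, ancestor, distance):
    """Log-likelihood of the observed residue counts given one ancestral residue."""
    return sum(n * math.log(transition_prob(ancestor, r, distance))
               for r, n in counts.items())


def evolutionary_profile_column(column):
    """Return (ancestor weights, mixture distribution) for one aligned column."""
    counts = {}
    for residue in column:
        counts[residue] = counts.get(residue, 0) + 1

    # Hypothetical search grid of evolutionary distances: 0.05, 0.10, ..., 5.0.
    grid = [0.05 * k for k in range(1, 101)]

    # Step 1: best-fitting distance for each of the twenty possible ancestors.
    fits = {}
    for a in AMINO_ACIDS:
        d = max(grid, key=lambda t: log_likelihood(counts, a, t))
        fits[a] = (d, log_likelihood(counts, a, d))

    # Step 2: weight each ancestor by prior * likelihood, normalized to sum to 1.
    # Subtracting the maximum log-likelihood avoids numerical underflow.
    max_ll = max(ll for _, ll in fits.values())
    raw = {a: BACKGROUND[a] * math.exp(ll - max_ll) for a, (_, ll) in fits.items()}
    total = sum(raw.values())
    weights = {a: w / total for a, w in raw.items()}

    # Step 3: finite mixture of the per-ancestor residue distributions.
    mixture = {
        r: sum(weights[a] * transition_prob(a, r, fits[a][0]) for a in AMINO_ACIDS)
        for r in AMINO_ACIDS
    }
    return weights, mixture


if __name__ == "__main__":
    weights, mixture = evolutionary_profile_column("LLLLIV")
    top = sorted(mixture, key=mixture.get, reverse=True)[:3]
    print("highest-probability residues:", top)
```

In a full profile, this calculation would be repeated for every column of the multiple alignment, and the mixture probabilities would typically be converted to log-odds scores against the background before database searching.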