This study is directed toward finding the evolutionary roots of contemporary proteins, primarily by the computer-assisted comparison of available sequences. The study involves (a) searching a large protein sequence collection for potentially related sequences, (b) computer alignment of possibly related sequences and (c) the application of empirical and statistical tests to estimate the likelihood that the proteins being studied share common ancestry. Phylogenetic trees will be constructed wherever appropriate. A high-speed, large-memory computer is essential for all of these aspects. The required computer programs are all on hand, as is a large protein sequence database (NEWAT). All newly reported sequences are searched against the database as they appear. The proteins to be studied fall into several groups. First, there are those proteins "unique" to vertebrates that most likely share common ancestry with pre-vertebrate proteins. Examples are blood plasma proteins such as albumin, fibrinogen, fibronectin, the blood clotting proteins, ceruloplasmin, transferrin, the complement proteins and alpha-2-macroglobulin. Many of these proteins have internally duplicated sequences that can serve as clues to the times of invention or divergence. These studies can shed considerable light on structure-function relationships by underlining which features of a protein have been most rigorously conserved during evolution. In addition to those proteins unique to vertebrates, some mainstream enzymes and ancient proteins thought to be essential to all living creatures will be studied. These include nucleic acid polymerases, tRNA synthetases and mainstream glycolytic enzymes. By using a battery of computer methods in combination, we hope to tease out resemblances that were not previously discernible.
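As an illustration of steps (b) and (c), the following is a minimal sketch of one common way to align two sequences and judge whether their similarity exceeds chance. It is not the project's own software: the function names, the simple match/mismatch/gap scoring values and the shuffle-based z-score are all assumptions chosen for brevity. The idea is to compute a global alignment score, then compare it with the scores obtained after repeatedly shuffling one sequence, which preserves composition while destroying order.

```python
import random
import statistics


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not the project's scheme.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def shuffle_z_score(a, b, trials=200, seed=0):
    """Hypothetical helper: z-score of the real alignment score against
    scores obtained after shuffling sequence b (composition preserved)."""
    rng = random.Random(seed)
    observed = global_alignment_score(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(trials):
        rng.shuffle(residues)
        shuffled_scores.append(global_alignment_score(a, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    std = statistics.stdev(shuffled_scores)
    if std == 0:
        # Degenerate case: every shuffle scored the same.
        return 0.0 if observed == mean else float("inf")
    return (observed - mean) / std
```

A large positive z-score suggests that the observed similarity is unlikely to arise from composition alone; what threshold counts as evidence of common ancestry is a judgment that depends on the scoring scheme and the sequences involved.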