Funds are sought to continue a computer-based study of protein evolution. The major goal is to establish the evolutionary roots of contemporary proteins. To this end, newly appearing protein sequences are entered into a computer and searches are conducted against up-to-date sequence data bases for potentially related sequences. Candidate matches are subjected to appropriate statistical measures to assess the likelihood that similarities are not due to chance. Matched sequences are clustered into families and evolutionary trees are constructed. During the past several years a large number of unexpected relationships have been uncovered, by us and many others, and at this point it is clear that the number of protein prototypes is manageable. Even with only 5,000 protein sequences in the banks, and even with a high degree of redundancy in the data bases (i.e., the same proteins from many species), the chances are now better than even that any newly determined sequence from a mammalian organism will be found to resemble a sequence already in the collection. In this regard, of the last 40 sequences entered into our collection, more than half resemble already reported proteins, species redundancies aside. This implies that we are already in a good position for classifying sequences hierarchically with regard to their origins. We are especially interested in categorizing vertebrate blood plasma proteins, many of which are the result of a certain amount of "exon shuffling." This phenomenon presents certain technical problems to the sequence-comparer, in that the interchanged segments are often only 40-50 residues long. The proposal considers the need for simple procedures for recognizing shuffled exons, as well as other approaches designed to correct for structural biases that occasionally make unrelated sequences appear similar.
The need for simple procedures that can recognize relationships among proteins on the basis of their amino acid sequences alone will increasingly be felt as large-scale sequencing projects (the complete human genome, for example) are undertaken.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
2R01GM034434-04
Application #
3285395
Study Section
Biophysics and Biophysical Chemistry A Study Section (BBCA)
Project Start
1984-12-01
Project End
1990-11-30
Budget Start
1988-03-01
Budget End
1988-11-30
Support Year
4
Fiscal Year
1988
Total Cost
Indirect Cost
Name
University of California San Diego
Department
Type
Schools of Medicine
DUNS #
077758407
City
La Jolla
State
CA
Country
United States
Zip Code
92093
Smith, M W; Doolittle, R F (1992) A comparison of evolutionary rates of the two major kinds of superoxide dismutase. J Mol Evol 34:175-84
Nagel, G M; Doolittle, R F (1991) Evolution and relatedness in two aminoacyl-tRNA synthetase families. Proc Natl Acad Sci U S A 88:8121-5
Seely Jr, O; Feng, D F; Smith, D W et al. (1990) Construction of a facsimile data set for large genome sequence analysis. Genomics 8:71-82
Doolittle, R F; Riley, M (1990) The amino-terminal sequence of lobster fibrinogen reveals common ancestry with vitellogenins. Biochem Biophys Res Commun 167:16-9
Doolittle, R F; Feng, D F; Anderson, K L et al. (1990) A naturally occurring horizontal gene transfer from a eukaryote to a prokaryote. J Mol Evol 31:383-8
Xu, X; Doolittle, R F (1990) Presence of a vertebrate fibrinogen-like sequence in an echinoderm. Proc Natl Acad Sci U S A 87:2097-101
Doolittle, R F; Feng, D F; Johnson, M S et al. (1989) Origins and evolutionary relationships of retroviruses. Q Rev Biol 64:1-30
McClure, M A; Johnson, M S; Feng, D F et al. (1988) Sequence comparisons of retroviral proteins: relative rates of change and general phylogeny. Proc Natl Acad Sci U S A 85:2469-73
Feng, D F; Doolittle, R F (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol 25:351-60
McClure, M A; Johnson, M S; Doolittle, R F (1987) Relocation of a protease-like gene segment between two retroviruses. Proc Natl Acad Sci U S A 84:2693-7
