The need for rigorous statistics in sequence and structure comparison is now generally conceded, particularly in light of the success of the BLAST suite of programs at NCBI. Extending statistical methods from sequence problems to structure problems is clearly the next step in sorting out distant relationships between proteins. Protein threading is the operation of taking a query sequence and testing its ability to match members of a structural database. The structures may be thought of as tubes (secondary structures) that fix the orientations and distances for interactions between protein residues. The residues in the query sequence are like differently colored beads on a string. The "string of beads" is "threaded" through the "tubes" until the interactions between the protein residues are as favorable as possible. The question then arises: is the most favorable interaction discovered consistent with chance alone, or does it indicate a systematic similarity between the query sequence's structure and the tentative threading structure? We have some initial approximations to the statistical significance that are encouraging, in that they systematically and consistently overestimate the true p-values by a factor of about 300. We are currently exploring the protein structural correlations that produce this overestimate, in order to correct it.
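
The question of whether a best threading score is consistent with chance can be illustrated with a toy calculation. The sketch below is not the project's method: it assumes a gapless threading, a made-up two-class contact energy, and a shuffling-based empirical p-value in place of the analytical approximations described above. All names (`contact_energy`, `best_threading_score`, `empirical_p_value`) and the hydrophobic residue set are hypothetical choices made for illustration.

```python
import random

# Assumption: a crude two-class potential; residues in this set count as hydrophobic.
HYDROPHOBIC = set("AVLIMFWC")


def contact_energy(a, b):
    """Toy energy: -1 for a hydrophobic-hydrophobic contact, 0 otherwise."""
    return -1.0 if a in HYDROPHOBIC and b in HYDROPHOBIC else 0.0


def best_threading_score(sequence, contacts, template_length):
    """Slide the sequence through the template without gaps and return the
    lowest (most favorable) total contact energy over all offsets.

    `contacts` lists pairs (i, j) of template positions that touch.
    """
    if len(sequence) < template_length:
        raise ValueError("sequence is shorter than the template")
    best = None
    for offset in range(len(sequence) - template_length + 1):
        score = sum(
            contact_energy(sequence[offset + i], sequence[offset + j])
            for i, j in contacts
        )
        if best is None or score < best:
            best = score
    return best


def empirical_p_value(sequence, contacts, template_length, n_shuffles=1000, seed=0):
    """Estimate how often a shuffled sequence threads at least as well as
    the real one. Shuffling keeps composition but destroys residue order."""
    rng = random.Random(seed)
    observed = best_threading_score(sequence, contacts, template_length)
    residues = list(sequence)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        shuffled = "".join(residues)
        if best_threading_score(shuffled, contacts, template_length) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    template_contacts = [(0, 3), (1, 4)]  # hypothetical 5-position template
    score, p = empirical_p_value("SSLVSLVSS", template_contacts, 5)
    print(score, p)
```

An estimate of this kind is limited by the number of shuffles, which is one reason analytical approximations to the significance are attractive when a query is tested against a large structural database.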

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000080-02
Application #: 6111079
Study Section: Special Emphasis Panel (BRB)
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 2
Fiscal Year: 1998
Total Cost:
Indirect Cost:

Name: National Library of Medicine
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code: