The VAST computer program at NCBI currently performs its structure neighboring by setting up a graph representing matching secondary structure elements, searching for cliques within that graph, and then summing the scores of the edges of the matching clique. The statistical significance of the final sum is then calculated to decide whether two structures are similar. A general theory for systematically approximating most powerful statistics for database searches was applied to this problem as a specific testing ground. The automatic application of the general theory led to a 20% improvement in retrieval of matching structures, with no obvious degradation in the quality of the structure matches. The improvement will be incorporated into the next version of the VAST computer program. The theory also suggested several interesting directions for improving the sensitivity of available statistics. These will be explored systematically over the coming year. - Protein Structure Comparison, Database Statistics
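
The clique-and-sum step described above can be illustrated with a minimal sketch. This is not VAST's implementation: the node labels, edge scores, and function names (`find_cliques`, `best_clique`) are hypothetical, the clique search is a plain Bron-Kerbosch enumeration chosen for brevity, and the statistical significance calculation is omitted. Each node stands for a candidate pairing of one secondary structure element from each structure, and each edge carries an assumed compatibility score.

```python
from itertools import combinations


def find_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    Plain Bron-Kerbosch without pivoting; adj maps node -> set of neighbours.
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def best_clique(edge_scores):
    """Return (clique, score) for the maximal clique with the largest edge-score sum."""
    # Key scores by unordered node pair so lookup does not depend on edge direction.
    scores = {frozenset(edge): s for edge, s in edge_scores.items()}
    adj = {}
    for u, v in edge_scores:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def clique_score(clique):
        return sum(scores[frozenset(pair)] for pair in combinations(clique, 2))

    best = max(find_cliques(adj), key=clique_score)
    return best, clique_score(best)


# Hypothetical toy data: a node ("a1", "b1") pairs element a1 of structure A
# with element b1 of structure B; edge scores are made up for illustration.
toy_edges = {
    (("a1", "b1"), ("a2", "b2")): 2.0,
    (("a1", "b1"), ("a3", "b3")): 1.5,
    (("a2", "b2"), ("a3", "b3")): 1.0,
    (("a1", "b2"), ("a3", "b3")): 3.0,
}

clique, total = best_clique(toy_edges)
print(sorted(clique), total)
```

In this toy graph the three mutually compatible pairings form the highest-scoring clique, even though a single edge elsewhere has a larger individual score. A real search would then assess the significance of the summed score before declaring two structures similar.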

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Intramural Research (Z01)
Project #
1Z01LM000087-01
Application #
6228044
Study Section
Special Emphasis Panel (CBB)
Project Start
Project End
Budget Start
Budget End
Support Year
1
Fiscal Year
1999
Total Cost
Indirect Cost
Name
National Library of Medicine
Department
Type
DUNS #
City
State
Country
United States
Zip Code