Protein structure alignment is an essential tool in protein comparison and is used to understand protein functionality, stability, specificity and evolution. The more accurate an alignment method is, the better we can understand the similarities and differences in protein functionality, and in turn the more we can do to develop new and more effective therapeutics to cure diseases. Recently, we have developed a novel, accurate approach to protein structure alignment, the TOPOFIT method, which outperforms all other current methods in accuracy. The TOPOFIT method is also computationally efficient and manageable on a large scale, which permitted us to conduct one of the largest computational protein structure comparison experiments, covering all protein structures in the PDB (as of July 2005). A comprehensive database of the comparisons, TOPOFIT-DB (NAR, 2007), has recently been released to the public and contains more than 86,000,000 structural alignments, making it one of the largest protein structure comparison databases currently available. This new method also brings new discoveries and opens new opportunities. First, we have discovered that the method allows natural clustering of protein structures; second, a large body of strong, previously undetected structural relations across different folds has been found; third, a systematic, widespread occurrence of non-sequential alignments has been found, which appear to be distributed within all protein folds and across folds, and thus might represent a common rule of protein stability and a missing component in protein evolution studies.
We propose to employ the advantages and new opportunities provided by the TOPOFIT approach in the systematic analysis of protein structures in general and in application to specific biological problems, in order to develop new insight into protein stability, functionality, specificity and evolution, and to facilitate the development of new therapeutics to cure diseases. We also propose further and deeper development of the method, which will incorporate physical and chemical rules into protein comparison. Finally, we plan to implement these technologies in publicly available bioinformatics resources, TOPOFIT-DB and the Friend software, and to create new web servers where they are needed.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
5R01LM009519-02
Application #
7619629
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2008-05-01
Project End
2010-04-30
Budget Start
2009-05-01
Budget End
2010-04-30
Support Year
2
Fiscal Year
2009
Total Cost
$238,140
Indirect Cost
Name
Northeastern University
Department
Biology
Type
Schools of Arts and Sciences
DUNS #
001423631
City
Boston
State
MA
Country
United States
Zip Code
02115