Our long-term objectives are to develop robust algorithms that can predict protein tertiary structure and the quaternary structure of DNA-protein complexes, and to apply the methodology to important proteomes. Protein structures are important because they can assist in the elucidation of protein function. This is essential because the functions of roughly half the proteins in a given proteome are unknown. The proposed research builds on the recently developed and promising TASSER structure prediction algorithm, which uses threading-identified templates to provide continuous structural fragments and predicted tertiary contacts, followed by fold assembly and refinement protocols. TASSER provides reasonable models for ~70% of the single domain proteins that are weakly homologous to proteins with solved structures, and it often provides a significant improvement over the input threading alignment. To extend this approach and to address identified weaknesses, the following Specific Aims are proposed: (1) For single domain proteins, the performance of TASSER in the template-free limit will be improved. At present, this is the major weakness of TASSER. (2) TASSER will be extended to better predict the tertiary structure of membrane proteins. (3) TASSER will be extended to explicitly include prosthetic groups, metal ions, and small ligands in the protein modeling procedure, with the goal of producing more accurate structural predictions. (4) TASSER will be extended to better treat multidomain proteins. Currently, prediction success depends on whether the domain orientations in the target and template structures are similar. (5) TASSER will be extended to predict the structure of proteins bound to DNA. We shall then apply a recently developed algorithm that predicts whether a protein will bind DNA and, if so, model the structure of the DNA-protein complex, ultimately on a proteomic scale. (6) The effect of alternative splicing on the structure of single domain proteins will be explored. (7) The tertiary structures of proteins shorter than 300 residues will be predicted across a large number of proteomes.
Specific Aims 1-5 represent methodological advances, whereas Specific Aims 6 and 7 are designed to apply the improved TASSER algorithm to biologically important problems. For all Specific Aims, comprehensive benchmarking, including participation in future CASP experiments, will be carried out. All developed algorithms, tools, and results will be made available on our website, http://cssb.biology.gatech.edu/skolnick/.