One of the fundamental problems in computational biology is the prediction of a protein's 3D structural class, that is, the recognition of its fold from its linear sequence of amino acids. The proposed project aims to develop computational methods and tools for recognizing protein folds. The first specific aim is to build and deliver to the scientific community a web-based, discriminative fold-recognition software engine. This tool will, for the first time, make a discriminative fold-recognition algorithm available in a user-friendly form. Algorithms of this type have been described and repeatedly validated in the scientific literature over the past five years, but no easy-to-use software tool yet exists to bring this technology to the end user. The second specific aim is to improve upon existing fold-recognition algorithms by exploiting the inherently multiclass nature of the problem. Previous approaches have treated each fold class independently, thereby sacrificing statistical power. This project will produce algorithms and software that dramatically improve our ability to recognize, from the primary amino acid sequence, subtle structural similarities among proteins.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM074257-04
Application #: 7492267
Study Section: Special Emphasis Panel (ZRG1-BPC-Q (03))
Program Officer: Wehrle, Janna P
Project Start: 2005-05-01
Project End: 2010-06-30
Budget Start: 2008-07-01
Budget End: 2009-06-30
Support Year: 4
Fiscal Year: 2008
Total Cost: $299,068

Name: Sloan-Kettering Institute for Cancer Research
DUNS #: 064931884
City: New York
State: NY
Country: United States
Zip Code: 10065

Qi, Yanjun; Oja, Merja; Weston, Jason et al. (2012) A unified multitask architecture for predicting local protein properties. PLoS One 7:e32235
Melvin, Iain; Weston, Jason; Noble, William Stafford et al. (2011) Detecting remote evolutionary relationships among proteins by large-scale semantic embedding. PLoS Comput Biol 7:e1001047
Agius, Phaedra; Arvey, Aaron; Chang, William et al. (2010) High resolution models of transcription factor-DNA affinities improve in vitro and in vivo binding predictions. PLoS Comput Biol 6:
Melvin, Iain; Weston, Jason; Leslie, Christina et al. (2009) RANKPROP: a web server for protein remote homology detection. Bioinformatics 25:121-2
Nimrod, Guy; Szilágyi, András; Leslie, Christina et al. (2009) Identification of DNA-binding proteins using structural, electrostatic and evolutionary features. J Mol Biol 387:1040-53
Melvin, Iain; Weston, Jason; Leslie, Christina S et al. (2008) Combining classifiers for improved classification of proteins from sequence or structure. BMC Bioinformatics 9:389
Melvin, Iain; Ie, Eugene; Kuang, Rui et al. (2007) SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition. BMC Bioinformatics 8 Suppl 4:S2
Weston, Jason; Kuang, Rui; Leslie, Christina et al. (2006) Protein ranking by semi-supervised network propagation. BMC Bioinformatics 7 Suppl 1:S10
Kuang, Rui; Weston, Jason; Noble, William Stafford et al. (2005) Motif-based protein ranking by network propagation. Bioinformatics 21:3711-8