Every day, experts derive hypotheses about protein function and structure by looking at patterns in protein families. The hope is to infer whether protein X binds to protein Y and, if so, where. Two published methods address the goal of predicting protein interactions from sequence. The first uses a particular region in the hydrophobic moment plot to suggest binding interfaces; it predicts too many putative interfaces (66% of all residues). The second predicts interfaces from the co-occurrence of sequence motifs and domains, i.e. by making clever use of existing experimental information. By construction, this method cannot predict novel interactions. Here, we propose to develop methods that predict interface segments, i.e. regions of residues consecutive in sequence that are in contact with other interface segments. We propose separate methods for internal and external interfaces. We have recently shown that internal, chain-chain, and protein-protein interface segments differ on average. Thus, our first aim is to predict the interface type from a combination of sequence features and predicted properties (secondary structure, accessibility, and post-translational modifications). Such a prediction will already provide a first hint of functional regions, and it will simplify the next step.
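The notion of an interface segment used above (a run of residues, consecutive in sequence, that are all in an interface) can be illustrated with a minimal sketch. It assumes per-residue interface labels are already available, for example from a known structure or from a predictor; the function name `interface_segments` and the `min_len` parameter are hypothetical and are not taken from the proposal.

```python
from typing import List, Tuple


def interface_segments(labels: List[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive interface residues into segments.

    `labels[i]` is True if residue i is flagged as an interface residue
    (an assumed input; the proposal does not prescribe this format).
    Returns 0-based, inclusive (start, end) pairs for every run of True
    values that is at least `min_len` residues long.
    """
    segments = []
    start = None  # index where the current run began, if any
    for i, is_interface in enumerate(labels):
        if is_interface and start is None:
            start = i
        elif not is_interface and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(labels) - start >= min_len:
        segments.append((start, len(labels) - 1))
    return segments


# Hypothetical labels for a 13-residue chain: "1" marks an interface residue.
labels = [c == "1" for c in "0011100001100"]
print(interface_segments(labels))             # [(2, 4), (9, 10)]
print(interface_segments(labels, min_len=3))  # [(2, 4)]
```

A minimum length is one simple way to suppress isolated residues; whether and where to set such a cutoff is a design choice that this sketch leaves open.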
The second aim is to develop a system that suggests possible protein-protein interaction partners as well as putative oligomer interfaces. The third aim is to develop new methods that predict distances between internal interfaces. This final step should complement the prediction of external interfaces and assist methods that predict protein structure. The basic means explored will be combinations of statistics and neural networks using multiple sequence alignments. The goal is a low-resolution prediction that succeeds often enough in distinguishing internal from external interfaces to assist the design of experiments in molecular and medical biology. In the worst case, we expect to develop novel techniques that enable experimentalists to correctly identify the most likely protein-protein interaction segments most of the time. In the best case, we hope to introduce methods that allow automatic discoveries for entire proteomes and considerably help structure prediction.
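As an illustration of how a multiple sequence alignment might be turned into input for a neural network, the following sketch computes per-column amino-acid frequencies and flattens a sliding window of columns around one residue. This is a common encoding for sequence-based predictors, shown here as an assumption; the proposal does not specify its encoding, and the names `column_profile` and `window_features`, the window size, and the toy alignment are all hypothetical.

```python
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(alignment: List[str]) -> List[Dict[str, float]]:
    """Per-column amino-acid frequencies; gaps and unknown symbols are ignored."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return profile


def window_features(
    profile: List[Dict[str, float]], center: int, half_width: int = 4
) -> List[float]:
    """Flatten the profile columns in a window around `center`.

    Positions beyond either terminus are padded with zeros, so every
    residue yields a vector of (2 * half_width + 1) * 20 values.
    """
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(profile):
            features.extend(profile[pos][aa] for aa in AMINO_ACIDS)
        else:
            features.extend([0.0] * len(AMINO_ACIDS))
    return features


# Toy alignment (hypothetical): three aligned sequences, one gap.
alignment = ["MKV-A", "MRVLA", "MKILA"]
profile = column_profile(alignment)
features = window_features(profile, center=0, half_width=1)
print(len(features))  # 60
```

Each feature vector could then be paired with a per-residue label to train a classifier. The choice of classifier and any additional inputs, such as predicted secondary structure or accessibility, are outside the scope of this sketch.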

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM064633-02
Application #: 6743692
Study Section: Special Emphasis Panel (ZRG1-SSS-H (90))
Program Officer: Edmonds, Charles G
Project Start: 2003-05-01
Project End: 2007-04-30
Budget Start: 2004-05-01
Budget End: 2005-04-30
Support Year: 2
Fiscal Year: 2004
Total Cost: $258,234
Indirect Cost:
Name: Columbia University (N.Y.)
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 621889815
City: New York
State: NY
Country: United States
Zip Code: 10032

Ofran, Yanay; Mysore, Venkatesh; Rost, Burkhard (2007) Prediction of DNA-binding residues from sequence. Bioinformatics 23:i347-53
Ofran, Yanay; Rost, Burkhard (2007) Protein-protein interaction hotspots carved into sequences. PLoS Comput Biol 3:e119
Punta, Marco; Forrest, Lucy R; Bigelow, Henry et al. (2007) Membrane protein prediction methods. Methods 41:460-74
Ofran, Yanay; Rost, Burkhard (2007) ISIS: interaction sites identified from sequence. Bioinformatics 23:e13-6
Ofran, Yanay; Yachdav, Guy; Mozes, Eyal et al. (2006) Create and assess protein networks through molecular characteristics of individual proteins. Bioinformatics 22:e402-7
Passerini, Andrea; Punta, Marco; Ceroni, Alessio et al. (2006) Identifying cysteines and histidines in transition-metal-binding sites using support vector machines and neural networks. Proteins 65:305-16
Schlessinger, Avner; Ofran, Yanay; Yachdav, Guy et al. (2006) Epitome: database of structure-inferred antigenic epitopes. Nucleic Acids Res 34:D777-80
Schlessinger, Avner; Yachdav, Guy; Rost, Burkhard (2006) PROFbval: predict flexible and rigid residues in proteins. Bioinformatics 22:891-3
Punta, Marco; Rost, Burkhard (2005) Protein folding rates estimated from contact predictions. J Mol Biol 348:507-12
Grana, Osvaldo; Baker, David; MacCallum, Robert M et al. (2005) CASP6 assessment of contact prediction. Proteins 61 Suppl 7:214-24

Showing the most recent 10 out of 14 publications