In the structural genomics era, there is a need to extract and represent the 3D shapes of protein binding sites so that the information can be reused in a robust and simple manner. The long-term objective of this proposal is to develop a set of computational algorithms and a database that use local surface shape signatures of proteins to predict protein function and protein-protein docking. Storing precalculated binding sites in a database allows fast screening and comparison. Algorithms for identifying, representing, comparing, clustering, and docking local surface shape signatures of proteins will be developed. To identify binding sites, a visibility-based algorithm will be used, which can identify both cavity and protrusion regions. To represent identified binding sites, three hierarchical levels of representation are proposed. The first and simplest level uses feature points that capture the global or local maxima and minima of mean curvature, together with the radius and depth of a binding site. The second level uses a histogram-based method that captures the relative distances between the feature points of a binding site. The third level employs a voxelization method. The database of binding sites will employ R-tree-based multidimensional indexes to allow real-time screening and clustering of binding sites. A Self-Organizing Map will also be used for clustering, which allows clusters to be updated dynamically. Hierarchical clusters precalculated by the R-tree or the Self-Organizing Map provide a framework for dynamic clustering displayed through a zoomable user interface. A fast geometric hashing-based protein-protein docking algorithm will be developed, which uses precalculated binding sites to reduce the search space. A new invariant basis will be used in the hashing step to reduce the complexity of the algorithm from O(n^3) to O(n^2). A novel non-uniform hash table, which is tolerant to small errors or changes in parameters, will be used in the hashing step. The methodology will be extended to handle predicted structures with possible errors. The active site identification methods will be applied to predict the function of proteins of unknown function whose structures are determined by structural genomics projects. The docking algorithm will be applied to protein-protein interaction data of E. coli.
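
As an illustration of the second, histogram-based level of representation, the following is a minimal sketch, not the method of the proposal itself. It bins the pairwise distances between the feature points of a binding site and compares two sites by the L1 distance between their histograms. The function names, the bin width, the number of bins, the clamping of long distances into the last bin, and the choice of L1 distance are all assumptions made for this example; the abstract does not specify them.

```python
from itertools import combinations
from math import dist


def distance_histogram(points, bin_width=2.0, n_bins=8):
    """Normalized histogram of pairwise distances between feature points.

    points    : sequence of (x, y, z) coordinates (hypothetical input format)
    bin_width : width of each distance bin (assumed value)
    n_bins    : number of bins (assumed value)
    """
    counts = [0] * n_bins
    for p, q in combinations(points, 2):
        # Distances beyond the last bin edge are clamped into the last bin.
        idx = min(int(dist(p, q) / bin_width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        # Fewer than two points: no pairwise distances to record.
        return [0.0] * n_bins
    return [c / total for c in counts]


def histogram_l1(h1, h2):
    """L1 distance between two histograms; 0.0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    site_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
    # The same site translated by (5, 5, 5): pairwise distances are unchanged.
    site_b = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in site_a]
    h_a = distance_histogram(site_a)
    h_b = distance_histogram(site_b)
    print(histogram_l1(h_a, h_b))
```

Because only distances between points enter the histogram, the descriptor does not change when a binding site is translated or rotated, which is one reason a representation of this kind suits database screening.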

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM075004-02
Application #: 7125028
Study Section: Special Emphasis Panel (ZRG1-BDMA (01))
Program Officer: Li, Jerry
Project Start: 2005-09-23
Project End: 2010-08-31
Budget Start: 2006-09-01
Budget End: 2007-08-31
Support Year: 2
Fiscal Year: 2006
Total Cost: $296,458
Indirect Cost:

Name: Purdue University
Department: Biology
Type: Schools of Arts and Sciences
DUNS #: 072051394
City: West Lafayette
State: IN
Country: United States
Zip Code: 47907
Wei, Qing; La, David; Kihara, Daisuke (2017) BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns. Methods Mol Biol 1529:279-289
Fang, Yi; Sun, Mengtian; Dai, Guoxian et al. (2016) The Intrinsic Geometric Structure of Protein-Protein Interaction Networks for Protein Interaction Prediction. IEEE/ACM Trans Comput Biol Bioinform 13:76-85
Esquivel-Rodríguez, Juan; Xiong, Yi; Han, Xusi et al. (2015) Navigating 3D electron microscopy maps with EM-SURFER. BMC Bioinformatics 16:181
Otsuka, Yuta; Muto, Ai; Takeuchi, Rikiya et al. (2015) GenoBase: comprehensive resource database of Escherichia coli K-12. Nucleic Acids Res 43:D606-17
Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee et al. (2014) 3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces. Methods Mol Biol 1137:105-17
Esquivel-Rodriguez, Juan; Filos-Gonzalez, Vianney; Li, Bin et al. (2014) Pairwise and multimeric protein-protein docking using the LZerD program suite. Methods Mol Biol 1137:209-34
Esquivel-Rodriguez, Juan; Kihara, Daisuke (2013) Computational methods for constructing protein structure models from 3D electron microscopy maps. J Struct Biol 184:93-102
La, David; Kong, Misun; Hoffman, William et al. (2013) Predicting permanent and transient protein-protein interfaces. Proteins 81:805-18
Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen et al. (2013) A large-scale evaluation of computational protein function prediction. Nat Methods 10:221-7
Chitale, Meghana; Khan, Ishita K; Kihara, Daisuke (2013) In-depth performance evaluation of PFP and ESG sequence-based function prediction methods in CAFA 2011 experiment. BMC Bioinformatics 14 Suppl 3:S2

Showing the most recent 10 out of 56 publications