In the structural genomics era, there is a need to extract and represent the 3D shapes of protein binding sites so that the information can be reused in a robust and simple manner. The long-term objective of this proposal is to develop a set of computational algorithms and a database that use local surface shape signatures of proteins to predict protein function and protein-protein docking. Storing precalculated binding sites in a database allows fast screening and comparison. Algorithms for identifying, representing, comparing, clustering, and docking local surface shape signatures of proteins will be developed. To identify binding sites, a visibility-based algorithm will be used, which can identify both cavity and protrusion regions. To represent identified binding sites, three hierarchical levels of representation are proposed. The simplest representation uses feature points that capture global or local maxima and minima of mean curvature, together with the radius and depth of a binding site. The second level uses a histogram-based method that captures relative distances between the feature points of a binding site. The third level employs a voxelization method. The database of binding sites will employ R-tree-based multidimensional indexes to allow real-time screening and clustering of binding sites. A Self-Organizing Map will also be used for clustering, which allows clusters to be updated dynamically. Hierarchical clusters precalculated by the R-tree or the Self-Organizing Map provide a framework for dynamic clustering displayed through a zoomable user interface. A fast geometric hashing-based protein-protein docking algorithm will be developed, which uses precalculated binding sites to reduce the search space. A new invariant basis will be used in the hashing step to reduce the complexity of the algorithm from O(n^3) to O(n^2). A novel non-uniform hash table, which is tolerant to small errors or changes in parameters, will be used in the hashing step. The methodology will be extended to handle predicted structures with possible errors. The active site identification methods will be applied to predict the function of proteins of unknown function whose structures were determined by structural genomics projects. The docking algorithm will be applied to protein-protein interaction data of E. coli.
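The second-level, histogram-based representation mentioned above can be illustrated with a minimal sketch. It is not the proposal's implementation: the function names `distance_histogram` and `histogram_distance`, the bin count, the `max_dist` cutoff, and the L1 comparison are all assumptions chosen for illustration. The sketch bins all pairwise distances between the feature points of one binding site, which gives a descriptor that does not change when the site is rotated or translated.

```python
import numpy as np


def distance_histogram(points, n_bins=8, max_dist=20.0):
    """Normalized histogram of pairwise distances between feature points.

    `n_bins` and `max_dist` are illustrative defaults, not values from the proposal.
    """
    pts = np.asarray(points, dtype=float)
    # Indices of every unordered pair (i < j) of feature points.
    i, j = np.triu_indices(len(pts), k=1)
    dists = np.linalg.norm(pts[i] - pts[j], axis=1)
    # Clip so that distances beyond the cutoff are counted in the last bin.
    dists = np.clip(dists, 0.0, max_dist)
    hist, _ = np.histogram(dists, bins=n_bins, range=(0.0, max_dist))
    total = hist.sum()
    return hist / total if total else hist.astype(float)


def histogram_distance(h1, h2):
    """L1 distance between two descriptors (an assumed, simple comparison)."""
    return float(np.abs(np.asarray(h1) - np.asarray(h2)).sum())


# Hypothetical feature points of one binding site (arbitrary coordinates).
site = np.array([[0.0, 0.0, 0.0],
                 [3.0, 0.0, 0.0],
                 [0.0, 4.5, 0.0],
                 [0.0, 0.0, 11.0]])

# The same site after a rigid motion: rotation about z plus a translation.
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
moved = site @ rot_z.T + np.array([5.0, -2.0, 1.0])

h_site = distance_histogram(site)
h_moved = distance_histogram(moved)
```

Because rigid motions preserve distances, `h_site` and `h_moved` should agree, so two sites can be compared without first superimposing them. Fixed-length descriptors of this kind are also the sort of vector that a multidimensional index could store, although how the proposed database encodes its entries is not specified in the abstract.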
Wei, Qing; La, David; Kihara, Daisuke (2017) BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns. Methods Mol Biol 1529:279-289
Fang, Yi; Sun, Mengtian; Dai, Guoxian et al. (2016) The Intrinsic Geometric Structure of Protein-Protein Interaction Networks for Protein Interaction Prediction. IEEE/ACM Trans Comput Biol Bioinform 13:76-85
Esquivel-Rodríguez, Juan; Xiong, Yi; Han, Xusi et al. (2015) Navigating 3D electron microscopy maps with EM-SURFER. BMC Bioinformatics 16:181
Otsuka, Yuta; Muto, Ai; Takeuchi, Rikiya et al. (2015) GenoBase: comprehensive resource database of Escherichia coli K-12. Nucleic Acids Res 43:D606-17
Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee et al. (2014) 3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces. Methods Mol Biol 1137:105-17
Esquivel-Rodriguez, Juan; Filos-Gonzalez, Vianney; Li, Bin et al. (2014) Pairwise and multimeric protein-protein docking using the LZerD program suite. Methods Mol Biol 1137:209-34
Radivojac, Predrag; Clark, Wyatt T; Oron, Tal Ronnen et al. (2013) A large-scale evaluation of computational protein function prediction. Nat Methods 10:221-7
Chitale, Meghana; Khan, Ishita K; Kihara, Daisuke (2013) In-depth performance evaluation of PFP and ESG sequence-based function prediction methods in CAFA 2011 experiment. BMC Bioinformatics 14 Suppl 3:S2
Esquivel-Rodriguez, Juan; Kihara, Daisuke (2013) Computational methods for constructing protein structure models from 3D electron microscopy maps. J Struct Biol 184:93-102
La, David; Kong, Misun; Hoffman, William et al. (2013) Predicting permanent and transient protein-protein interfaces. Proteins 81:805-18
Showing the most recent 10 out of 56 publications