Computational prediction of protein structure from the amino acid sequence is one of the most important and challenging problems in bioinformatics and computational biology. With the exponential growth in the number of protein sequences that lack solved structures in the post-genomic era, accurate protein structure prediction methods and tools are urgently needed. Here, we propose to develop an integrated approach to advance protein structure prediction at the 1-dimensional (1D), 2-dimensional (2D), and 3-dimensional (3D) levels. At the 1D level, novel information such as domain evolution signals, alternative gene splicing sites, and 2D protein contact maps will be used to predict protein domain boundaries from sequence. At the 2D level, new methods such as residue contact propagation, machine learning boosting, linear programming, and Markov Chain Monte Carlo simulations will be used to advance residue-residue contact prediction for a domain or a whole protein. At the 3D level, 2D contact prediction, fold recognition via machine learning, and multi-template combination will be used to enhance both template-based and ab initio structure prediction. Finally, knowledge-based statistical machine learning methods and model combination algorithms will be developed to reliably evaluate and refine the quality of predicted protein structural models. One of several innovative aspects of this approach is that the 1D, 2D, and 3D predictions are integrated so that they improve one another through a shared protein structural unit, the domain. The 1D, 2D, and 3D protein structure prediction methods will be implemented as user-friendly software packages and web services and released to the scientific community. These tools and web services will be useful for protein structure prediction, structure determination, functional analysis, protein engineering, protein mutagenesis analysis, and protein design.
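To make the 2D representation concrete, the following minimal sketch shows what a residue-residue contact map is. It is an illustration only, not the project's method: the abstract does not state how contacts are defined, so the 8 Å distance cutoff (a common convention in the field), the function name `contact_map`, and the toy coordinates are all assumptions introduced here.

```python
import math


def contact_map(coords, threshold=8.0, min_separation=1):
    """Build a symmetric 0/1 contact matrix from per-residue 3D coordinates.

    coords: one (x, y, z) tuple per residue, in angstroms (for example,
            C-alpha or C-beta positions; which atom to use is an assumption).
    threshold: distance cutoff in angstroms (8.0 is assumed, not from the source).
    min_separation: minimum sequence separation |i - j| for a pair to be scored.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            # Two residues are "in contact" if they lie closer than the cutoff.
            if math.dist(coords[i], coords[j]) < threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical toy chain: four residues spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

In this toy chain only the two end residues (11.4 Å apart) fall outside the cutoff. A contact predictor estimates such a matrix from sequence alone, and the predicted map can then serve as input to domain boundary prediction at the 1D level and as restraints for structure modeling at the 3D level, which is the kind of cross-level integration the abstract describes.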
The project will develop accurate computational methods and tools for basic biomedical research such as protein structure prediction, protein function analysis, protein design, protein engineering, and structure-based drug design.
|Korasick, David A; White, Tommi A; Chakravarthy, Srinivas et al. (2018) NAD+ promotes assembly of the active tetramer of aldehyde dehydrogenase 7A1. FEBS Lett 592:3229-3238|
|Adhikari, Badri; Hou, Jie; Cheng, Jianlin (2018) DNCON2: improved protein contact prediction using two-level deep convolutional neural networks. Bioinformatics 34:1466-1472|
|Hou, Jie; Adhikari, Badri; Cheng, Jianlin (2018) DeepSF: deep convolutional neural network for mapping protein sequences to folds. Bioinformatics 34:1295-1303|
|Liu, Li-Kai; Tanner, John J (2018) Crystal Structure of Aldehyde Dehydrogenase 16 Reveals Trans-Hierarchical Structural Similarity and a New Dimer. J Mol Biol :|
|Adhikari, Badri; Cheng, Jianlin (2018) CONFOLD2: improved contact-driven ab initio protein structure modeling. BMC Bioinformatics 19:22|
|Korasick, David A; Končitíková, Radka; Kopečná, Martina et al. (2018) Structural and Biochemical Characterization of Aldehyde Dehydrogenase 12, the Last Enzyme of Proline Catabolism in Plants. J Mol Biol :|
|Adhikari, Badri; Hou, Jie; Cheng, Jianlin (2018) Protein contact prediction by integrating deep multiple sequence alignments, coevolution and machine learning. Proteins 86 Suppl 1:84-96|
|Keasar, Chen; McGuffin, Liam J; Wallner, Björn et al. (2018) An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12. Sci Rep 8:9939|
|Korasick, David A; Wyatt, Jesse W; Luo, Min et al. (2017) Importance of the C-Terminus of Aldehyde Dehydrogenase 7A1 for Oligomerization and Catalytic Activity. Biochemistry 56:5910-5919|
|Cao, Renzhi; Adhikari, Badri; Bhattacharya, Debswapna et al. (2017) QAcon: single model quality assessment using protein structural and contact information with machine learning techniques. Bioinformatics 33:586-588|