The goal of this R21/R33 proposal is to develop novel models, scoring schemes, and techniques based on the mini-threading approach for protein structure prediction.

During the R21 phase, we will focus on proof-of-principle development of our new methods. First, we will develop new statistical models and computational methods to identify useful fragments in the PDB for a query protein. In particular, we will identify protein fragments of variable lengths in the PDB according to statistically significant matches, instead of limiting the fragments to 9-mers as existing methods do. Second, in addition to the angular restraints used in current threading methods, we will formulate spatial restraints in Cartesian coordinates, derived from the alignments between a query sequence and its fragment hits of known structure. Third, we will investigate new optimization problem formulations to build coarse-grain structural models. Specifically, we will tailor advanced optimization techniques, such as semidefinite programming and evolutionary algorithms, to find efficient methods of assembling local structures. Fourth, we will evaluate the confidence of predicted protein structures through clustering of sampled conformations, correlated mutation, and neural networks. Fifth, we will build all-atom structural models for selected coarse-grain models, and further evaluate the models using properties of atomic structures under perturbation (e.g., high temperature or force).

During the R33 phase, we will focus on the evaluation, refinement, extension, and application of the methods developed during the R21 phase. First, we will perform large-scale evaluations of the methods and refine them based on the evaluations and tests. Second, we will implement the methods as a stand-alone software package for public distribution and as a publicly available Web server. Third, we will expand our methods to structure prediction of membrane proteins. Finally, we will apply the methods to selected proteins that have a significant impact on human health, such as CFTR channels, proteins encoded in the SARS genome, the strabismus (stbm)/van Gogh (Vang) protein, and the ARC superfamily.

The new techniques may significantly increase the accuracy of protein structure prediction while saving computing time. They will extend to membrane proteins, whose structures are understudied even though they are major drug targets for many diseases. Our studies will shed light on the structures and functions of a set of key human proteins, which may help researchers characterize disease genes and develop new treatments with substantial savings of resources.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Exploratory/Developmental Grants Phase II (R33)
Project #
4R33GM078601-03
Application #
7648313
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Remington, Karin A
Project Start
2006-07-01
Project End
2011-06-30
Budget Start
2008-07-05
Budget End
2009-06-30
Support Year
3
Fiscal Year
2008
Total Cost
$218,735
Indirect Cost
Name
University of Missouri-Columbia
Department
Biostatistics & Other Math Sci
Type
Schools of Engineering
DUNS #
153890272
City
Columbia
State
MO
Country
United States
Zip Code
65211
Wang, Chao; Zhang, Haicang; Zheng, Wei-Mou et al. (2016) FALCON@home: a high-throughput protein structure prediction server based on remote homologue recognition. Bioinformatics 32:462-4
He, Zhiquan; Ma, Wenji; Zhang, Jingfen et al. (2015) A New Hidden Markov Model for Protein Quality Assessment Using Compatibility Between Protein Sequence and Structure. Tsinghua Sci Technol 19:559-567
He, Zhiquan; Zhang, Chao; Xu, Yang et al. (2014) MUFOLD-DB: a processed protein structure database for protein structure prediction and analysis. BMC Genomics 15 Suppl 11:S2
Yu, DongMei; Zhang, Chao; Qin, PeiWu et al. (2014) RNA-protein distance patterns in ribosomes reveal the mechanism of translational attenuation. Sci China Life Sci 57:1131-9
Zhang, Jingfen; Xu, Dong (2013) Fast algorithm for population-based protein structural model analysis. Proteomics 13:221-9
Wang, Dan; Shang, Yi (2013) Modeling Physiological Data with Deep Belief Networks. Int J Inf Educ Technol 3:505-511
Wang, Han; He, Zhiquan; Zhang, Chao et al. (2013) Transmembrane protein alignment and fold recognition based on predicted topology. PLoS One 8:e69744
Wang, Qingguo; Shang, Charles; Xu, Dong et al. (2013) New MDS and clustering based algorithms for protein model quality assessment and selection. Int J Artif Intell Tools 22:1360006
Gao, Jianjiong; Xu, Dong (2012) Correlation between posttranslational modification and intrinsic disorder in protein. Pac Symp Biocomput :94-103
Yao, Qiuming; Gao, Jianjiong; Bollinger, Curtis et al. (2012) Predicting and analyzing protein phosphorylation sites in plants using musite. Front Plant Sci 3:186

Showing the most recent 10 out of 22 publications