New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction

Xu, Dong

Abstract

The goal of this R21/R33 proposal is to develop novel models, scoring schemes, and techniques based on the mini-threading approach for protein structure prediction. During the R21 phase, we will focus on proof-of-principle development of our new methods. First, we will develop new statistical models and computational methods to identify useful fragments in the PDB for a query protein. In particular, we will identify protein fragments of variable lengths in the PDB according to statistically significant matches, instead of limiting the fragments to 9-mers as existing methods do. Second, besides the angular restraints used in current threading methods, we will formulate spatial restraints in Cartesian coordinates, derived from the alignments between a query sequence and its fragment hits of known structure. Third, we will investigate new optimization problem formulations to build coarse-grained structural models. Specifically, we will tailor advanced optimization techniques, such as semidefinite programming and evolutionary algorithms, to find efficient methods of assembling local structures. Fourth, we will evaluate the confidence of predicted protein structures through clustering of sampled conformations, correlated mutations, and neural networks. Fifth, we will build all-atom structural models for selected coarse-grained models, and further evaluate the models using properties of atomic structures under perturbation (e.g., high temperature or force). During the R33 phase, we will focus on the evaluation, refinement, extension, and application of the methods developed during the R21 phase. First, we will perform large-scale evaluations of the methods and refine them based on the results. Second, we will implement the methods as a stand-alone software package for public distribution and as a publicly available Web server. Third, we will extend our methods to structure prediction of membrane proteins. Finally, we will apply the methods to selected proteins that have a significant impact on human health, such as CFTR channels, proteins encoded in the SARS genome, the strabismus (stbm)/van Gogh (Vang) protein, and the ARC superfamily. The new techniques may significantly increase the accuracy of protein structure prediction while saving computing time. They will extend to membrane proteins, whose structures are understudied even though they are major drug targets for many diseases. Our studies will shed some light on the structures and functions of a set of key human proteins, which may help researchers characterize disease genes and develop new treatments with substantial savings of resources.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Exploratory/Developmental Grants Phase II (R33)
Project #: 4R33GM078601-03
Application #: 7648313
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Remington, Karin A

Project Start: 2006-07-01
Project End: 2011-06-30
Budget Start: 2008-07-05
Budget End: 2009-06-30
Support Year: 3
Fiscal Year: 2008
Total Cost: $218,735
Indirect Cost

Institution

Name: University of Missouri-Columbia
Department: Biostatistics & Other Math Sci
Type: Schools of Engineering
DUNS #: 153890272

City: Columbia
State: MO
Country: United States
Zip Code: 65211

Related projects


NIH 2010 R33 GM	New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction Xu, Dong / University of Missouri-Columbia	$219,712
NIH 2009 R33 GM	New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction Xu, Dong / University of Missouri-Columbia	$220,339
NIH 2008 R33 GM	New Scoring, Assembly and Evaluation Techniques for Protein Structure Prediction Xu, Dong / University of Missouri-Columbia	$218,735

Publications

Wang, Chao; Zhang, Haicang; Zheng, Wei-Mou et al. (2016) FALCON@home: a high-throughput protein structure prediction server based on remote homologue recognition. Bioinformatics 32:462-4

He, Zhiquan; Ma, Wenji; Zhang, Jingfen et al. (2015) A New Hidden Markov Model for Protein Quality Assessment Using Compatibility Between Protein Sequence and Structure. Tsinghua Sci Technol 19:559-567

He, Zhiquan; Zhang, Chao; Xu, Yang et al. (2014) MUFOLD-DB: a processed protein structure database for protein structure prediction and analysis. BMC Genomics 15 Suppl 11:S2

Yu, DongMei; Zhang, Chao; Qin, PeiWu et al. (2014) RNA-protein distance patterns in ribosomes reveal the mechanism of translational attenuation. Sci China Life Sci 57:1131-9

Zhang, Jingfen; Xu, Dong (2013) Fast algorithm for population-based protein structural model analysis. Proteomics 13:221-9

Wang, Dan; Shang, Yi (2013) Modeling Physiological Data with Deep Belief Networks. Int J Inf Educ Technol 3:505-511

Wang, Han; He, Zhiquan; Zhang, Chao et al. (2013) Transmembrane protein alignment and fold recognition based on predicted topology. PLoS One 8:e69744

Wang, Qingguo; Shang, Charles; Xu, Dong et al. (2013) New MDS and clustering based algorithms for protein model quality assessment and selection. Int J Artif Intell Tools 22:1360006

Zhang, Jingfen; He, Zhiquan; Wang, Qingguo et al. (2012) Prediction of protein tertiary structures using MUFOLD. Methods Mol Biol 815:3-13

Zhang, Chao; Hanspers, Kristina; Kuchinsky, Allan et al. (2012) Mosaic: making biological sense of complex networks. Bioinformatics 28:1943-4

Showing the most recent 10 out of 22 publications
