Protein structure mediates protein function and, ultimately, organismal behavior. A combination of computational and experimental approaches is necessary to determine structures for the large numbers of protein sequences available from whole-genome sequencing projects. We propose a novel approach that integrates easily obtained data from Nuclear Magnetic Resonance (NMR) experiments on proteins with our prediction methodologies to model structures accurately and rapidly. Specifically, our aims are to:

1. Automate secondary structure assignment using chemical shift, J-coupling, and unassigned NOE data together with sequence-based algorithms. We will use neural networks to combine the different datasets efficiently and accurately.
2. Sample protein conformational space by translating secondary structure, chemical shift, J-coupling, and database tendencies into backbone angle probability distributions. These distributions, generated using neural networks, will be used to bias the sample space explored by our de novo methods for a given protein sequence so that a large proportion of native-like conformations consistent with the input data are encountered.
3. Select the most native-like conformations by combining NMR data with existing statistical and physical functions. NMR scoring functions will be based on the similarity of backbone angles to the calculated probability distributions and of simulated NOE spectra to the input NOE data.
4. Refine the quality of the conformational ensemble by automatically assigning the NOE data to obtain non-local constraints. The simulated spectra from the best-scoring conformations will be used to obtain an initial subset of constraints, which will be incorporated into the generation of new conformations. The NOE data are thus assigned iteratively, and the quality of the conformations improves, until a final set of structures fitting the input data is obtained.
5. Test the methods developed in a robust and unbiased manner. We will set up internal testing mechanisms that avoid bias toward particular classes of proteins; evaluate components of predictions separately from whole predictions to identify those that work well and those that need further improvement; and perform continuous benchmarking of our methods.
6. Enable NMR experimentalists to submit sequences for which we will make predictions using the methods described above. We will publish the software produced, and the information obtained, using database-driven interfaces on the World Wide Web.
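The biased sampling idea in aim 2 can be illustrated with a minimal Python sketch. It assumes each residue already has a probability for a few coarse backbone-angle bins, as a predictor might supply. The bin names, bin center angles, example probabilities, jitter width, and the function name `sample_backbone` are all hypothetical and are not taken from the project's software.

```python
import random

# Hypothetical coarse (phi, psi) bin centers in degrees.
# A real method would use much finer, data-derived distributions.
BINS = {
    "helix": (-60.0, -45.0),
    "strand": (-120.0, 130.0),
    "left": (60.0, 45.0),
}


def sample_backbone(prob_per_residue, rng, jitter=15.0):
    """Draw one (phi, psi) pair per residue from its bin probabilities."""
    names = list(BINS)
    angles = []
    for probs in prob_per_residue:
        # Missing bins are treated as probability zero.
        weights = [probs.get(name, 0.0) for name in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        phi0, psi0 = BINS[chosen]
        # Uniform jitter keeps samples from sitting exactly on bin centers.
        angles.append((phi0 + rng.uniform(-jitter, jitter),
                       psi0 + rng.uniform(-jitter, jitter)))
    return angles


# Made-up per-residue probabilities standing in for predictor output.
example_probs = [
    {"helix": 0.8, "strand": 0.1, "left": 0.1},
    {"helix": 0.0, "strand": 1.0, "left": 0.0},
    {"helix": 1.0, "strand": 0.0, "left": 0.0},
]

rng = random.Random(0)  # fixed seed so the run is repeatable
conformation = sample_backbone(example_probs, rng)
for phi, psi in conformation:
    print(f"phi={phi:7.1f}  psi={psi:7.1f}")
```

Residues with sharply peaked probabilities are sampled almost entirely from one region, while uncertain residues are explored more broadly; this is the sense in which such distributions would bias the search toward conformations consistent with the input data.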

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Exploratory/Developmental Grants Phase II (R33)
Project #: 5R33GM068152-03
Application #: 6888939
Study Section: Special Emphasis Panel (ZRG1-SSS-H (90))
Program Officer: Wehrle, Janna P
Project Start: 2003-05-01
Project End: 2006-04-30
Budget Start: 2005-05-01
Budget End: 2006-04-30
Support Year: 3
Fiscal Year: 2005
Total Cost: $315,134
Indirect Cost:

Name: University of Washington
Department: Microbiology/Immun/Virology
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195
Horst, Jeremy A; Samudrala, Ram (2010) A protein sequence meta-functional signature for calcium binding residue prediction. Pattern Recognit Lett 31:2103-2112
Bernard, Brady; Samudrala, Ram (2009) A generalized knowledge-based discriminatory function for biomolecular interactions. Proteins 76:115-28
Guerquin, Michal; McDermott, Jason; Frazier, Zach et al. (2009) The Bioverse API and web application. Methods Mol Biol 541:511-34
Samudrala, Ram; Heffron, Fred; McDermott, Jason E (2009) Accurate prediction of secreted substrates and identification of a conserved putative secretion signal for type III secretion systems. PLoS Pathog 5:e1000375
Wang, Kai; Horst, Jeremy A; Cheng, Gong et al. (2008) Protein meta-functional signatures from combining sequence, structure, evolution, and amino acid property information. PLoS Comput Biol 4:e1000181
Liu, Tianyun; Guerquin, Michal; Samudrala, Ram (2008) Improving the accuracy of template-based predictions by mixing and matching between initial models. BMC Struct Biol 8:24
Ngan, Shing-Chung; Hung, Ling-Hong; Liu, Tianyun et al. (2008) Scoring functions for de novo protein structure prediction revisited. Methods Mol Biol 413:243-81
Jenwitheesuk, Ekachai; Horst, Jeremy A; Rivas, Kasey L et al. (2008) Novel paradigms for drug discovery: computational multitarget screening. Trends Pharmacol Sci 29:62-71
Jenwitheesuk, Ekachai; Samudrala, Ram (2007) Identification of potential HIV-1 targets of minocycline. Bioinformatics 23:2797-9
Oren, Ersin Emre; Tamerler, Candan; Sahin, Deniz et al. (2007) A novel knowledge-based approach to design inorganic-binding peptides. Bioinformatics 23:2816-22

Showing the most recent 10 out of 26 publications