NMR chemical shifts provide important local structural information for proteins. Consistent structure generation from NMR chemical shift data has recently become feasible for proteins of up to 130 residues, and the resulting structures are comparable in quality to those obtained with the standard NMR protocol. In collaboration with Dr. David Baker and his group, we previously developed a chemical-shift-guided approach that accurately determines structures on the basis of chemical shifts for systems of fewer than about 130 amino acids. New work focuses on extending this approach to incorporate easily accessible experimental information. Using an optimized neural network algorithm, SPARTA+, we can estimate chemical shifts for proteins of known structure. This is an important step towards finding fragments in the crystallographic structure database that are structurally compatible with fragments of proteins for which only NMR chemical shift assignments are available. Integration with the previously developed chemical shift Rosetta (CS-Rosetta) program yields a significant performance enhancement. Other enhancements to the CS-Rosetta procedure itself make it suitable for determining the structure of homo-oligomeric proteins, as demonstrated for the catalytic core domain of HIV integrase.
A new program, TALOS-N, has been developed for predicting protein backbone torsion angles from NMR chemical shifts. The program relies far more extensively on trained artificial neural networks than its predecessor, TALOS+. Validation on an independent set of proteins indicates that backbone torsion angles can be predicted for a larger fraction of the residues (≥90%), with an error rate below ca. 3.5%, using an acceptance criterion that is nearly two-fold tighter than that used previously, and with a root mean square difference between predicted and crystallographically observed (phi, psi) torsion angles of ca. 12°. TALOS-N also reports sidechain chi1 rotameric states for about 50% of the residues, with 89% consistency with reference structures. The program includes a neural network trained to identify secondary structure from residue sequence and chemical shifts.
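The root mean square difference between predicted and observed torsion angles quoted above can be illustrated with a minimal sketch. It assumes angles in degrees, paired (phi, psi) tuples per residue, and pooling of phi and psi deviations into a single RMS value. The function names `wrap_deg` and `torsion_rmsd` are hypothetical and are not part of TALOS-N, and the exact statistic used in the validation may differ.

```python
import math

def wrap_deg(delta):
    """Wrap an angular difference in degrees into the range [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def torsion_rmsd(predicted, observed):
    """RMS difference over paired (phi, psi) angles, in degrees.

    Assumption: phi and psi deviations are pooled into one RMS value.
    Differences are wrapped so that 179 and -179 count as 2 degrees apart.
    """
    squared = []
    for (p_phi, p_psi), (o_phi, o_psi) in zip(predicted, observed):
        squared.append(wrap_deg(p_phi - o_phi) ** 2)
        squared.append(wrap_deg(p_psi - o_psi) ** 2)
    return math.sqrt(sum(squared) / len(squared))
```

The wrapping step matters because torsion angles are periodic: a naive subtraction would treat a prediction of 179° against an observed -179° as a 358° error rather than a 2° one.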