NMR chemical shifts provide important local structural information for proteins. Consistent structure generation from NMR chemical shift data has recently become feasible for proteins of up to 130 residues, and such structures are comparable in quality to those obtained with the standard NMR protocol. This study investigates how the completeness of chemical shift assignments influences structures generated from chemical shifts. The Chemical-Shift-Rosetta (CS-Rosetta) protocol was used for de novo protein structure generation at various degrees of assignment completeness, simulated by omitting entries from the experimental chemical shift data previously used for the initial demonstration of the CS-Rosetta approach. In addition, a new CS-Rosetta protocol is described that improves the robustness of the method for proteins with missing or erroneous NMR chemical shift input data. This strategy, which uses traditional Rosetta to pre-filter the fragment selection process, is demonstrated for two paramagnetic proteins and for two proteins with solid-state NMR chemical shift assignments.

NMR chemical shifts in proteins depend strongly on local structure. The program TALOS establishes an empirical relation between ¹³C, ¹⁵N and ¹H chemical shifts and the backbone torsion angles phi and psi (G. Cornilescu et al., J. Biomol. NMR 13, 289-302, 1999). Extension of the original 20-protein database to 200 proteins increased the fraction of residues for which backbone angles could be predicted from 65% to 74%, while reducing the error rate from 3% to 2%. Addition of a two-layer neural network filter results in a new program, TALOS+, which further raises the prediction rate to 88%, with a small reduction in error rate. Excluding the 2% of residues for which TALOS makes predictions that strongly differ from those observed in the crystalline state, the accuracy of the predicted phi and psi angles equals 13°.
Large discrepancies between predictions and crystal structures are primarily limited to loop regions, and in the few cases where multiple X-ray structures are available, such residues are often found in different states in the different structures. The TALOS+ output includes predictions for individual residues with missing chemical shifts, and the neural network component of the program also predicts secondary structure with good accuracy.
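An accuracy figure of this kind, a typical deviation between predicted and crystallographically observed torsion angles after setting aside strongly deviating residues, can be illustrated with a short sketch. The abstract does not state how the figure is computed, so the following is only one plausible reading: a root-mean-square of angular differences, wrapped to account for the 360° periodicity of phi and psi. The function names and the `outlier_cutoff` parameter are hypothetical and are not part of TALOS or TALOS+.

```python
import math


def wrapped_diff(pred_deg, obs_deg):
    """Signed difference pred - obs in degrees, wrapped into [-180, 180)."""
    return (pred_deg - obs_deg + 180.0) % 360.0 - 180.0


def angular_rmsd(pred_angles, obs_angles, outlier_cutoff=None):
    """Root-mean-square deviation between two lists of torsion angles (degrees).

    outlier_cutoff is a hypothetical parameter: pairs whose absolute wrapped
    difference exceeds it are excluded before averaging, mimicking the
    exclusion of strongly deviating residues.
    """
    diffs = [wrapped_diff(p, o) for p, o in zip(pred_angles, obs_angles)]
    if outlier_cutoff is not None:
        diffs = [d for d in diffs if abs(d) <= outlier_cutoff]
    if not diffs:
        raise ValueError("no angle pairs left after outlier exclusion")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up example values, not data from the study.
predicted_phi = [-60.0, -120.0, 179.0]
observed_phi = [-65.0, -110.0, -179.0]
print(angular_rmsd(predicted_phi, observed_phi, outlier_cutoff=60.0))
```

The wrapping step matters near the ±180° boundary: a prediction of 179° against an observation of -179° differs by 2°, not 358°.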