Automating the determination of macromolecular structures by X-ray crystallography is crucial for producing new advances and discoveries in structural biology. Many of the preliminary steps, such as crystallization and data collection, have benefitted from recent advances in technology, and the development of new computational techniques (and faster computers) has made the refinement of phases and the generation of electron density maps more efficient and accurate. However, the final step of interpreting the electron density map and constructing a model with atomic coordinates has resisted automation and remains one of the primary bottlenecks to streamlining large-scale structural genomics projects. We propose a novel approach to automated model-building based on the principles of pattern recognition: regions of a map are described by calculated numerical features, and a database of previously solved maps is searched for regions with similar patterns of density. This approach has been implemented in an automated model-building system called TEXTAL, and preliminary results show that it is capable of predicting local molecular structures (e.g. side-/main-chain atoms of a residue) that can be assembled to form global models with coordinate RMS errors of approximately 0.75 Å, when initial locations of C-alpha atoms are precisely known. In this proposal, we hypothesize that 1) pattern recognition can also be used to accurately identify the locations of C-alpha atoms in a map, 2) local models can be improved by exploiting information from nearby regions, such as amino acid identity, main-chain direction, secondary structure, and tertiary contacts, and 3) the models output by TEXTAL can be improved through a combination of real-space and reciprocal-space refinement.
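The database search described above can be illustrated with a minimal sketch. It assumes each region of density has already been reduced to a short vector of numerical features and uses a plain Euclidean nearest-neighbor lookup. The abstract does not specify the features or the similarity measure TEXTAL uses, so the feature values, the region labels, and the function names here are all hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def find_best_match(query_features, database):
    """Return the (label, features) entry closest to the query.

    `database` is a list of (label, feature_vector) pairs standing in
    for regions taken from previously solved maps (hypothetical data).
    """
    return min(database, key=lambda entry: euclidean_distance(query_features, entry[1]))

# Hypothetical feature vectors; real features would be computed from density.
database = [
    ("ALA-like region", [0.20, 0.10, 0.05]),
    ("PHE-like region", [0.80, 0.60, 0.40]),
    ("GLY-like region", [0.05, 0.02, 0.01]),
]

query = [0.75, 0.55, 0.45]  # features of an uninterpreted region
best_label, best_features = find_best_match(query, database)
print(best_label)
```

In a scheme of this kind, the coordinates stored with the best-matching database region would serve as the local model for the query region.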
In addition, as part of the program project, we will develop an interface for TEXTAL as a software component in PHENIX, the proposed integrated crystallography system written in the Python scripting language. Finally, one of the new challenges presented by this integrated system is how to use the various components efficiently and effectively to solve structures from datasets automatically. This involves complex decision-making under uncertainty to determine which programs to run (and which parameters to set) to give the best chance of a high-quality solution. We propose to implement an intelligent decision-making algorithm for use within PHENIX, based on decision theory in intelligent agents, i.e. selecting actions that maximize expected utility. Such an approach is a necessary step toward fully utilizing the expanded capabilities of an integrated system for automated structure determination, because it captures the flexible decision-making process that human crystallographers use when solving structures.
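As an illustration of selecting actions that maximize expected utility, the sketch below scores each candidate action by the probability-weighted sum of its outcome utilities and picks the highest. The action names, probabilities, and utilities are invented for the example and are not taken from PHENIX or from the proposal.

```python
def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(p * u for p, u in outcomes)

def choose_action(actions):
    """Return the name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

# Hypothetical actions, each with (probability, utility) pairs that sum to
# probability 1. Utilities stand in for gain in model quality minus cost.
actions = {
    "run_density_modification": [(0.6, 8.0), (0.4, -2.0)],
    "build_model_now": [(0.3, 10.0), (0.7, -3.0)],
    "collect_more_data": [(0.9, 3.0), (0.1, -1.0)],
}

best = choose_action(actions)
print(best)
```

Here the riskier action with the largest possible payoff ("build_model_now") is not chosen, because its high chance of a poor outcome lowers its expected utility below that of the alternatives.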

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Program Projects (P01)
Project #: 1P01GM063210-01
Application #: 6485351
Study Section: Special Emphasis Panel (ZRG1)
Project Start: 2001-07-01
Project End: 2006-06-30
Budget Start:
Budget End:
Support Year: 1
Fiscal Year: 2001
Total Cost:
Indirect Cost:
Name: Lawrence Berkeley National Laboratory
Department:
Type:
DUNS #: 078576738
City: Berkeley
State: CA
Country: United States
Zip Code: 94720
Richardson, Jane S; Williams, Christopher J; Hintze, Bradley J et al. (2018) Model validation: local diagnosis, correction and when to quit. Acta Crystallogr D Struct Biol 74:132-142
Herzik Jr, Mark A; Fraser, James S; Lander, Gabriel C (2018) A Multi-model Approach to Assessing Local and Global Cryo-EM Map Quality. Structure :
Kryshtafovych, Andriy; Monastyrskyy, Bohdan; Adams, Paul D et al. (2018) Distribution of evaluation scores for the models submitted to the second cryo-EM model challenge. Data Brief 20:1629-1638
Moriarty, Nigel W; Liebschner, Dorothee; Klei, Herbert E et al. (2018) Interactive comparison and remediation of collections of macromolecular structures. Protein Sci 27:182-194
Kryshtafovych, Andriy; Adams, Paul D; Lawson, Catherine L et al. (2018) Evaluation system and web infrastructure for the second cryo-EM model challenge. J Struct Biol 204:96-108
Terwilliger, Thomas C; Adams, Paul D; Afonine, Pavel V et al. (2018) Map segmentation, automated model-building and their application to the Cryo-EM Model Challenge. J Struct Biol 204:338-343
Williams, Christopher J; Headd, Jeffrey J; Moriarty, Nigel W et al. (2018) MolProbity: More and better reference data for improved all-atom structure validation. Protein Sci 27:293-315
Terwilliger, Thomas C; Adams, Paul D; Afonine, Pavel V et al. (2018) A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nat Methods 15:905-908
Richardson, Jane S; Williams, Christopher J; Videau, Lizbeth L et al. (2018) Assessment of detailed conformations suggests strategies for improving cryoEM models: Helix at lower resolution, ensembles, pre-refinement fixups, and validation at multi-residue length scale. J Struct Biol 204:301-312
Zheng, Min; Moriarty, Nigel W; Xu, Yanting et al. (2017) Solving the scalability issue in quantum-based refinement: Q|R#1. Acta Crystallogr D Struct Biol 73:1020-1028

Showing the most recent 10 out of 136 publications