X-ray crystallography remains the workhorse experimental technique for obtaining the 3-D structures of proteins. Deformable fragments in the protein, ranging in length from 5 to 20 amino acids, often lead to disorder in the crystal, which makes the electron density map difficult to interpret. At the same time, such mobile fragments can be critical to a protein's ability to bind to other molecules and thus achieve its function. This project will develop a new mathematical model for a redundant, closed, protein-like kinematic chain, and will use techniques from algebraic geometry and differential topology, in particular Morse theory, together with probabilistic roadmap techniques from robotics, to determine the structure of its configuration space (the self-motion manifold). Precise knowledge of this structure can be exploited to develop efficient algorithms capable of defining a probability distribution over all configurations, derived from the electron density map and an energetic model of the molecule. The objective of the program is to enable crystallographers to retrieve and study important dynamic properties of the molecule, and furthermore to develop new algorithms for protein model building in areas of weak or ambiguous electron density. In a multimodal disordered case, the goal will be to identify all substates, along with their probabilities, and to reconstruct energetically plausible conformational pathways.
Proteins are the worker molecules that carry out life's essential processes. A protein's function is largely dictated by its folded, three-dimensional structure. X-ray crystallography is the most widely used experimental technique for obtaining a protein's 3-D structure. Flexible fragments in the protein, vital for performing its function, are often poorly resolved in the data, leading to structural models with gaps. This project will investigate the mathematical structure of the set of configurations of gap-closing fragments, and use these insights together with experimental data and energy models to associate a likelihood with each configuration. Algorithms will be developed to study important dynamic properties of proteins and to aid protein structure determination from experimental data. The research and development described in this proposal will have a direct impact on medical research. Improved 3-D models will lead to a better understanding of a protein's function. An ability to infer the most likely configurations of a fragment that is known to be of functional significance may enhance structure-based drug design capabilities. The project will furthermore contribute substantially to the Protein Structure Initiative's (PSI) broader mission of developing methods for automating crystallographic analysis. The program is a joint effort of the Department of Mathematics and the Computer Science Department at Stanford University, and the Joint Center for Structural Genomics (an NIGMS-funded PSI center) at the Stanford Synchrotron Radiation Laboratory. The project will support the training of PhD-level students at Stanford University. Computer software based on the research will be made available to the structural biology community.