The proposal "The Crystallography of Macromolecules" addresses the limitations of diffraction data analysis methods in X-ray crystallography. The significance of this work stems from the importance of the technique, which generates uniquely detailed information about cellular processes at the atomic level. The structural results obtained with crystallography are used to explain and validate results obtained by other biophysical, biochemical and cell biology techniques, to generate hypotheses for detailed studies of cellular processes and to guide drug design studies, all of which are highly relevant to the NIH mission. The proposal focuses on method development to address a frequent situation in which crystal size and order are insufficient to obtain a structure from a single crystal. This is particularly frequent for large eukaryotic complexes and membrane proteins, where structural information is the most valuable to the NIH mission. The diffraction power of a single crystal is directly related to the microscopic order and size of that specimen, and it is also one of the main correlates of structure solution success. When a single crystal yields insufficient data, the usual remedy is to use multiple crystals and average the data between them, which allows even very weak signals to be retrieved. However, different crystals of the same protein, even if they are very similar (i.e., have the same crystal lattice symmetry and very similar unit cell dimensions), still differ somewhat in their order. This non-isomorphism is often high enough to make structure solution with averaged data impossible. Moreover, the use of multiple datasets complicates decision making, as each dataset contains different information and it is not clear when and how to combine them. The proposed solution relies on hierarchical analysis. First, the shape of the diffraction spot profiles will be modeled using a novel approach (Aim 1).
This will form the basis for the next step, the deconvolution of overlapping Bragg spot profiles from multiple lattices (Aim 2). An additional benefit of the algorithms developed in Aim 1 is that they will automatically derive the integration parameters and identify artifacts, making the whole process more robust. This is particularly significant for high-throughput and multiple-crystal analysis.
In Aim 3, data from multiple crystals will be compared to identify subsets of data that should be merged to produce optimal results. The critical aspect of this analysis will be the identification and assessment of non-isomorphism between datasets. The experimental decision-making strategy is the subject of Aim 4. The Support Vector Machine (SVM) method will be used to evaluate the suitability of available datasets for possible methods of structure solution. In cases of insufficient data, it will identify the most significant factor that needs to be improved.
Aim 5 is to simplify navigation of data reduction and to integrate the results of previous aims with other improvements in hardware and computing.
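The general idea of averaging data between crystals while excluding non-isomorphous ones can be illustrated with a minimal sketch. It is not the method proposed in Aims 1 to 5, whose algorithms are not described here. The sketch assumes that each dataset is a mapping from Miller indices to an (intensity, sigma) pair, that similarity to a reference crystal is judged by a Pearson correlation on common reflections, and that accepted datasets are merged with inverse-variance weights. The function names, the `min_cc` threshold and the example values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_isomorphous(datasets, reference, min_cc=0.9):
    """Names of datasets whose intensities correlate with the reference.

    The correlation test and the min_cc threshold are illustrative
    assumptions, not the non-isomorphism assessment of the proposal.
    """
    ref = datasets[reference]
    keep = []
    for name, data in datasets.items():
        common = sorted(set(ref) & set(data))
        if len(common) < 3:
            continue  # too few shared reflections to compare
        cc = pearson([ref[h][0] for h in common],
                     [data[h][0] for h in common])
        if cc >= min_cc:
            keep.append(name)
    return keep


def merge(datasets, names):
    """Inverse-variance weighted mean intensity for every reflection."""
    sums = {}
    for name in names:
        for hkl, (intensity, sigma) in datasets[name].items():
            weight = 1.0 / sigma ** 2
            total_w, total_wi = sums.get(hkl, (0.0, 0.0))
            sums[hkl] = (total_w + weight, total_wi + weight * intensity)
    # The merged sigma shrinks as more observations are combined.
    return {hkl: (wi / w, 1.0 / sqrt(w)) for hkl, (w, wi) in sums.items()}


# Hypothetical example: crystal_c is deliberately non-isomorphous.
hkls = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
datasets = {
    "crystal_a": dict(zip(hkls, [(100.0, 4.0), (50.0, 4.0), (10.0, 4.0), (80.0, 4.0)])),
    "crystal_b": dict(zip(hkls, [(104.0, 4.0), (48.0, 4.0), (12.0, 4.0), (78.0, 4.0)])),
    "crystal_c": dict(zip(hkls, [(20.0, 4.0), (90.0, 4.0), (70.0, 4.0), (15.0, 4.0)])),
}
accepted = select_isomorphous(datasets, "crystal_a")
merged = merge(datasets, accepted)
```

In this toy input the third crystal is rejected, and the merged reflections have a smaller sigma than any single observation, which is why averaging helps recover weak signals.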

Public Health Relevance

The goal of the proposal is to develop methods for the analysis of X-ray diffraction data, with a particular focus on the novel analysis of diffraction spot shape and the streamlining of data analysis in multi-crystal modes. The development of such methods is essential to advance structural studies in thousands of projects, each of which is important to the NIH mission.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM053163-20
Application #: 8657051
Study Section: Macromolecular Structure and Function D Study Section (MSFD)
Program Officer: Edmonds, Charles G
Project Start: 1996-05-01
Project End: 2015-04-30
Budget Start: 2014-05-01
Budget End: 2015-04-30
Support Year: 20
Fiscal Year: 2014
Total Cost: $320,096
Indirect Cost: $75,630

Name: University of Texas Sw Medical Center Dallas
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 800771545
City: Dallas
State: TX
Country: United States
Zip Code: 75390
Özkan, Engin; Chia, Poh Hui; Wang, Ruiqi Rachel et al. (2014) Extracellular architecture of the SYG-1/SYG-2 adhesion complex instructs synaptogenesis. Cell 156:482-94
Zimmerman, Matthew D; Grabowski, Marek; Domagalski, Marcin J et al. (2014) Data management in the modern structural biology and biomedical research environment. Methods Mol Biol 1140:1-25
Rashin, Alexander A; Domagalski, Marcin J; Zimmermann, Michael T et al. (2014) Factors correlating with significant differences between X-ray structures of myoglobin. Acta Crystallogr D Biol Crystallogr 70:481-91
Domagalski, Marcin J; Zheng, Heping; Zimmerman, Matthew D et al. (2014) The quality and validation of structures from structural genomics. Methods Mol Biol 1091:297-314
Majorek, Karolina A; Kuhn, Misty L; Chruszcz, Maksymilian et al. (2014) Double trouble-Buffer selection and His-tag presence may be responsible for nonreproducibility of biomedical experiments. Protein Sci 23:1359-68
Offermann, Lesa R; Chan, Siew Leong; Osinski, Tomasz et al. (2014) The major cockroach allergen Bla g 4 binds tyramine and octopamine. Mol Immunol 60:86-94
Orlikowska, Marta; Szymanska, Aneta; Borek, Dominika et al. (2013) Structural characterization of V57D and V57P mutants of human cystatin C, an amyloidogenic protein. Acta Crystallogr D Biol Crystallogr 69:577-86
Ouyang, Zhuqing; Zheng, Ge; Song, Jianhua et al. (2013) Structure of the human cohesin inhibitor Wapl. Proc Natl Acad Sci U S A 110:11355-60
Chruszcz, Maksymilian; Ciardiello, Maria Antonietta; Osinski, Tomasz et al. (2013) Structural and bioinformatic analysis of the kiwifruit allergen Act d 11, a member of the family of ripening-related proteins. Mol Immunol 56:794-803
Barker, Megan K; Rose, David R (2013) Specificity of Processing α-glucosidase I is guided by the substrate conformation: crystallographic and in silico studies. J Biol Chem 288:13563-74

Showing the most recent 10 out of 47 publications