This project's main objective is to develop computerized tools that help x-ray crystallographers rapidly determine the three-dimensional structure of a protein. More specifically, the project addresses the following task: given a 3D electron-density map from crystallography and the sequence of the protein, find the most likely layout (i.e., "trace") of the protein sequence in 3D. The project will create both automated methods based on statistical machine-learning and computer-vision techniques and visualization tools that support humans doing this layout. These two approaches are complementary and synergistic. The first specific aim is to develop and empirically evaluate algorithms that interpret crystallographic electron-density maps. The second specific aim is to incorporate structural-biology domain knowledge (secondary-structure prediction and potential-energy calculations) into the project's algorithms for interpreting density maps. The third specific aim is to tightly integrate partial model construction with phase-estimation updates to improve the recognition of 3D protein structures in x-ray reflection data; crystallographers will be able to intervene whenever they wish to help "steer" this iterative process. The final specific aim is to develop intuitive and effective modalities, including virtual reality and the use of speech/audio, for the efficient use of crystallographers' time in manual model fitting and validation. Structural biology has wide relevance to biomedicine, since protein function generally follows from protein form (i.e., its structure). This project's techniques will speed up the process of determining protein 3D structures, especially from low-quality (i.e., low-resolution) x-ray data, and will be applicable to other structural-biology tasks. Being able to accurately interpret low-resolution data promises to allow higher-throughput structure determination.
The broader impact will include a better understanding of the power of modern theories and algorithms in machine learning and visualization in solving biological problems.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM008796-04
Application #: 7599114
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Project Start: 2006-04-01
Project End: 2012-03-31
Budget Start: 2009-04-01
Budget End: 2012-03-31
Support Year: 4
Fiscal Year: 2009
Total Cost: $306,728
Indirect Cost:
Name: University of Wisconsin Madison
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 161202122
City: Madison
State: WI
Country: United States
Zip Code: 53715
Yennamalli, Ragothaman; Arangarasan, Raj; Bryden, Aaron et al. (2014) Using a commodity high-definition television for collaborative structural biology. J Appl Crystallogr 47:1153-1157
Burgie, Sethe E; Bingman, Craig A; Soni, Ameet B et al. (2012) Structural characterization of human Uch37. Proteins 80:649-54
Soni, Ameet; Shavlik, Jude (2012) Probabilistic ensembles for improved inference in protein-structure determination. J Bioinform Comput Biol 10:1240009
DiMaio, Frank P; Soni, Ameet B; Phillips Jr, George N et al. (2009) Spherical-harmonic decomposition for molecular recognition in electron-density maps. Int J Data Min Bioinform 3:205-27
DiMaio, Frank; Kondrashov, Dmitry A; Bitto, Eduard et al. (2007) Creating protein models from electron-density maps using particle-filtering methods. Bioinformatics 23:2851-8
DiMaio, Frank; Shavlik, Jude; Phillips, George N (2006) A probabilistic approach to protein backbone tracing in electron density maps. Bioinformatics 22:e81-9