The broad goal is to develop and apply computational methods for building data-derived models of the structure and dynamics of proteins and their assemblies. These models can give insights into how the assemblies work, how they evolved, how they can be controlled, and how similar functionality can be designed. One successful approach, integrative structure modeling, casts the building of such models as a computational optimization problem where all knowledge about the assembly is encoded into the scoring function used to evaluate candidate models. It is proposed here to extend and enhance the open source Integrative Modeling Platform (IMP), which provides programmatic support for developing and distributing integrative structure modeling protocols. IMP allows representation of molecules at a variety of resolutions, use of scoring functions based on many types of data, and searches for solutions by a variety of sampling algorithms. In addition, IMP is easily extensible to add support for new data sources and algorithms, and is distributed under an open source license, with more than 300 unique downloads since March 2010. So far, it has been applied mostly to data from electron microscopy, small angle X-ray scattering, and various proteomics methods. The package will be extended to address a greater range of biological problems and to make it more generally useful to the scientific community. Specifically, the traditional scoring functions used by IMP will be supplemented with inference-based scoring functions that extract the maximum possible information from the data. The formulation of these functions will follow a Bayesian approach with minimal assumptions and approximations, to account for errors and incompleteness in the data as well as for sample heterogeneity.
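As an illustration of what an inference-based scoring function can look like, the following minimal sketch scores a candidate model by its negative log-posterior under an assumed Gaussian noise model with an unknown noise level. The function name `bayesian_score`, the distance-type data, and the Jeffreys prior on `sigma` are assumptions made for this example; this is not the IMP API or the formulation used in the project.

```python
import math


def bayesian_score(model_distances, observed_distances, sigma):
    """Negative log-posterior, up to an additive constant (hypothetical helper).

    Assumes independent Gaussian noise with standard deviation `sigma` on each
    observed distance and a Jeffreys prior p(sigma) proportional to 1/sigma.
    Lower scores mean the model explains the data better.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if len(model_distances) != len(observed_distances):
        raise ValueError("model and data must have the same length")
    n = len(observed_distances)
    squared_error = sum(
        (m - o) ** 2 for m, o in zip(model_distances, observed_distances)
    )
    # -log likelihood of n Gaussian observations (constant term dropped)
    neg_log_likelihood = n * math.log(sigma) + squared_error / (2.0 * sigma ** 2)
    # -log of the Jeffreys prior 1/sigma (normalization dropped)
    neg_log_prior = math.log(sigma)
    return neg_log_likelihood + neg_log_prior
```

Because `sigma` is an argument rather than a fixed constant, a sampler can treat the noise level as one more unknown and infer it together with the structure, which is one way such a score can account for errors in the data.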
Sampling of the scoring function landscape will be improved by a method that efficiently divides the complete set of degrees of freedom into potentially overlapping subsets, finds optimal and suboptimal solutions for the subsets independently by traditional optimizers or enumeration, and then combines compatible solutions to obtain guaranteed best-scoring solutions for the whole system. IMP will also be extended to make best use of the wealth of information provided by mass spectrometry. To maximize the impact of IMP and its utility to the community, it will be interfaced with other packages, including structure viewers such as Chimera, structure prediction and design programs such as Rosetta, and web portals such as the Protein Model Portal. Finally, the software will be well-tested and documented, and the growing IMP community will be supported with mailing lists, examples, demonstrations at workshops, and hosting of select users at UCSF.
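The divide-and-conquer sampling idea described above can be sketched on a toy problem with discrete degrees of freedom. The sketch assumes the total score is a sum of per-subset scores, enumerates every assignment of each subset, and merges only assignments that agree on shared variables. All names (`enumerate_subset`, `compatible`, `combine`, `best_solution`) are hypothetical and the code is not the IMP implementation; a practical version would keep only a limited set of good subset solutions rather than all of them.

```python
from itertools import product


def enumerate_subset(variables, domains, score):
    """Return every (assignment, score) pair for one subset of variables."""
    solutions = []
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        solutions.append((assignment, score(assignment)))
    return solutions


def compatible(a, b):
    """Two partial assignments are compatible if shared variables agree."""
    return all(b[name] == value for name, value in a.items() if name in b)


def combine(solutions_a, solutions_b):
    """Merge compatible partial solutions, adding their scores."""
    merged = []
    for a, score_a in solutions_a:
        for b, score_b in solutions_b:
            if compatible(a, b):
                merged.append(({**a, **b}, score_a + score_b))
    return merged


def best_solution(subsets, domains):
    """subsets is a list of (variables, score_function) pairs.

    Because every subset solution is kept, the returned assignment is the
    global minimum of the summed score.
    """
    combined = [({}, 0.0)]
    for variables, score in subsets:
        combined = combine(combined, enumerate_subset(variables, domains, score))
    return min(combined, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy system: three degrees of freedom, two subsets overlapping on "y".
    domains = {"x": [0, 1, 2], "y": [0, 1, 2], "z": [0, 1, 2]}
    subsets = [
        (("x", "y"), lambda a: (a["x"] - 1) ** 2 + (a["y"] - a["x"]) ** 2),
        (("y", "z"), lambda a: (a["z"] - a["y"] - 1) ** 2),
    ]
    print(best_solution(subsets, domains))
```

The overlap on `y` is what makes the combination step non-trivial: a pair of subset solutions is merged only when both assign the same value to `y`.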

Public Health Relevance

Project Narrative: We propose to extend IMP, a computer program that can describe the three-dimensional shapes of large macromolecular machines that are not amenable to solution with a single experimental technique. These structures will allow us to better understand the workings of the cell, both under normal and disease conditions.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Lyster, Peter
University of California San Francisco
Schools of Pharmacy
San Francisco
United States
Webb, Benjamin; Lasker, Keren; Velazquez-Muriel, Javier et al. (2014) Modeling of proteins and their assemblies with the Integrative Modeling Platform. Methods Mol Biol 1091:277-95
Spill, Yannick G; Kim, Seung Joong; Schneidman-Duhovny, Dina et al. (2014) SAXS Merge: an automated statistical method to merge SAXS profiles using Gaussian processes. J Synchrotron Radiat 21:203-8
Molnar, Kathleen S; Bonomi, Massimiliano; Pellarin, Riccardo et al. (2014) Cys-scanning disulfide crosslinking and Bayesian modeling probe the transmembrane signaling mechanism of the histidine kinase, PhoQ. Structure 22:1239-51
Bonomi, Massimiliano; Pellarin, Riccardo; Kim, Seung Joong et al. (2014) Determining protein complex structures based on a Bayesian model of in vivo Förster resonance energy transfer (FRET) data. Mol Cell Proteomics 13:2812-23
Erzberger, Jan P; Stengel, Florian; Pellarin, Riccardo et al. (2014) Molecular architecture of the 40S·eIF1·eIF3 translation initiation complex. Cell 158:1123-35
Schneidman-Duhovny, Dina; Pellarin, Riccardo; Sali, Andrej (2014) Uncertainty in integrative structural modeling. Curr Opin Struct Biol 28:96-104
Kim, Seung Joong; Fernandez-Martinez, Javier; Sampathkumar, Parthasarathy et al. (2014) Integrative structure-function mapping of the nucleoporin Nup133 suggests a conserved mechanism for membrane anchoring of the nuclear pore complex. Mol Cell Proteomics 13:2911-26
Algret, Romain; Fernandez-Martinez, Javier; Shi, Yi et al. (2014) Molecular architecture and function of the SEA complex, a modulator of the TORC1 pathway. Mol Cell Proteomics 13:2855-70
Shi, Yi; Fernandez-Martinez, Javier; Tjioe, Elina et al. (2014) Structural characterization by cross-linking reveals the detailed architecture of a coatomer-related heptameric module from the nuclear pore complex. Mol Cell Proteomics 13:2927-43
Zeng-Elmore, Xiaohui; Gao, Xiong-Zhuo; Pellarin, Riccardo et al. (2014) Molecular architecture of photoreceptor phosphodiesterase elucidated by chemical cross-linking and integrative modeling. J Mol Biol 426:3713-28

Showing the most recent 10 out of 38 publications