There is an unmet need in medicine and the basic sciences for accurate atomic structures of proteins. This need surpasses the capabilities of traditional high-resolution experimental methods. With machine learning advances, structure prediction algorithms are poised to provide atomic models for these areas in the near future. Yet gaps in prediction algorithms limit accuracy and reliability, particularly for large multi-domain proteins, protein complexes, and flexible proteins. Our proposal, Towards Accurate protein structure Predictions with SAXS TechnologY (TAPESTRY), will create technology to increase the reliability and improve the accuracy of protein structure predictions through experimental validation, particularly for difficult proteins. TAPESTRY is innovative in combining our strengths in high-throughput synchrotron SAXS (Small Angle X-ray Scattering) data collection and analysis with the Critical Assessment of protein Structure Prediction (CASP), which every two years assesses structure predictions against "gold standard", not-yet-released crystal structures. Through CASP, we take advantage of the collective protein folding knowledge of the global community of structure prediction scientists. Our approach is strategic. We provide SAXS data for CASP, giving prediction scientists access to experimental data. We develop analytical and experimental tools, designed for prediction scientists, to overcome current gaps that limit the use of SAXS data. We test these tools against our TAPESTRY databases of standard proteins, with corresponding crystal structures, SAXS data, and predicted models. Finally, we evaluate the robustness of our technology through CASP and obtain an unbiased assessment of our tools and the state of the field. As a first step, this proposal targets well-folded proteins (Aim 1) and proteins with disordered tails (Aim 2).
The feasibility of our technology proposal is supported by our current data and proofs of concept, our beamline capabilities, and our proven experience in SAXS analysis. We show that experimental SAXS data, which contain distance information that can act as restraints in protein structure prediction algorithms, match crystal structures of well-folded proteins and score predictions based on topological accuracy. We show cases in CASP13 (2018) in which SAXS data improved the fold of predicted models. SAXS data collection is rapid (10 seconds), does not require labeling or crystallization, and is available at no cost to the scientific community. We have proven experience in developing informative and effective SAXS analytical tools. Our long-term goal is to enable biomedical researchers to input an amino acid sequence and rapidly obtain one or more experimentally validated, accurate atomic models that reflect the protein conformation(s) in solution. If TAPESTRY is successful, the increased availability of such atomic models will have strong and broad potential to advance biomedical research and impact all areas of biology in which proteins are involved.
Atomic structures are critical for medical science, from understanding how proteins function to designing drugs. Structure prediction algorithms could provide atomic models for these purposes in the near future. This proposal will leverage Small Angle X-ray Scattering (SAXS), a structure-based technology, to develop methods that experimentally validate these atomic models and increase their accuracy.