Variations in protein binding preferences are a critical barrier to the precision treatment of disease. When high-resolution structures of a protein are available, and many isoforms of the protein have been connected to differing binding preferences, it is possible in principle to model the structures of all isoforms and discover the mechanisms that cause variations in binding preferences. Unfortunately, this discovery process depends on human expertise in examining molecular structure, and because hundreds of isoforms may exist, a human expert cannot objectively examine so many similar isoforms. To fill this gap, this project will (A1) develop software that identifies structural mechanisms that cause differential binding preferences, categorizes similar structural mechanisms, and explains the mechanisms in English.
The second aim of this project (A2) is to validate the software at a large scale on families of proteins that exhibit a variety of well-examined binding preferences, and through blind predictions with experimental collaborators. Our approach involves creating software that mimics the visual reasoning techniques employed by structural biologists when examining molecular structures. Not only are these techniques responsible for most major discoveries in structural biology, but they are also straightforward for non-computational researchers to understand. This property will enable our software to integrate immediately into existing workflows at labs that do not focus on computational methods. This property also contrasts with existing methods, which generally output structural models, potential energies, p-values, and structural scores that are difficult for non-experts to understand or incorporate into their research. Often, an expert in biophysics is required to interpret the outputs so that they can be operationalized in laboratory environments. In preliminary results, our methods have already identified molecular mechanisms that govern specificity in several families of proteins. Verification against peer-reviewed experimental results has confirmed the preliminary results in almost all cases. Our methods have also been applied to make a blind prediction of binding mechanisms in the ricin toxin, which binds to and damages the human ribosome. With experimental collaborators, we showed that our methods correctly identified and predicted the roles of several amino acids whose role in recognizing the ribosome was previously unknown. Using our methodological approach and our rigorous validation strategy, this project will produce a highly validated, usable software package that will bridge a critical gap in the development of precision therapies and diagnostics.
Variations in protein binding preferences are a critical barrier to precision medicine and precise diagnostics. We will develop software that will identify and categorize molecular mechanisms that cause these variations. The resulting insights will enable clinicians to more precisely select therapies to achieve superior outcomes.