Our current understanding of the molecular mechanisms of disease and the structure-based design of drugs for treatment both rely on experimentally determined 3D structures of proteins and other macromolecules complexed with small-molecule ligands. Many of these structures have direct relevance to public health, especially complexes of drug targets with drugs, inhibitors, substrates, or allosteric effectors. Yet structure-based drug discovery is severely hindered by experimental bias and by the shortcomings of current methods of experimental ligand identification, which often result in misidentified, missing, or misplaced ligands. The propagation of erroneous structures, combined with the increased accessibility of structural data, not only thwarts reproducibility in biomedical research and drug discovery but also diverts valuable resources down doomed research avenues. We will leverage our extensive experience validating and refining ligand binding sites to generate ligand reference libraries that will be made publicly available on a new web resource dedicated to the interaction of small molecules and macromolecules. These libraries can be used in many downstream applications, such as drug design, computational chemistry, biology, and bioinformatics. We will utilize recent technological advances in machine learning in conjunction with existing tools to create a standardized protocol for density interpretation and unbiased, reproducible ligand identification. This pipeline will be able not only to identify and model ligands in unassigned density fragments, but also to detect and correct suboptimally refined ligands in existing structures. As the proposed AI will be free from cognitive bias, it should alleviate the most severe problems in structure-based drug design.
Because improperly interpreted structures can have a significant deleterious ripple effect, we will use X-ray crystallography or electron microscopy to experimentally verify selected biomedically important structures in which critical small molecules have dubious experimental support.

Public Health Relevance

This proposal addresses current shortcomings of ligand identification in experimentally determined structures through a combination of novel machine learning algorithms and existing validation mechanisms. The main deliverables are 1) curated libraries of validated ligand binding sites, together with the tools used to produce them, and 2) a machine-learning-assisted pipeline for the interpretation of density fragments corresponding to small molecules within macromolecular structures. Some biologically important ligand binding sites with ambiguous, incomplete, or no experimental data will be experimentally verified using X-ray crystallography or electron microscopy.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM132595-02
Application #
10019572
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Lyster, Peter
Project Start
2019-09-17
Project End
2023-06-30
Budget Start
2020-07-01
Budget End
2021-06-30
Support Year
2
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Virginia
Department
Physiology
Type
Schools of Medicine
DUNS #
065391526
City
Charlottesville
State
VA
Country
United States
Zip Code
22904