Macro-molecules, such as proteins and nucleic acids, are the major building blocks of the cell. Many macro-molecules perform their function by interacting with each other. Characterizing these interactions helps elucidate how living organisms function at the molecular level, contributes to the development of treatments for diseases such as cancer, and facilitates the design of novel bio-inspired materials. A detailed understanding of macro-molecular interaction mechanisms requires determining the three-dimensional structures of the complexes the molecules form. These structures are very difficult to obtain using experimental techniques, so computational approaches, collectively called macro-molecular docking, can be very useful. The investigator has developed fast and effective algorithms and software that, according to the worldwide evaluation experiment CAPRI (Critical Assessment of Predicted Interactions), are among the best for predicting the structures of protein-protein complexes. These methods have been implemented in the fully automated docking server ClusPro, which is free for academic use and has over 18,000 regular users. However, current macro-molecular docking tools are effective mostly for rigid macro-molecules that do not significantly change conformation upon binding. This severely limits the applicability of the approach. The goal of this project is to develop new algorithms for docking flexible molecules. Expanding the scope of docking approaches will lead to a better understanding of fundamental biological questions and will facilitate biochemical, biomedical, and biotechnology research. In addition, the methods will be used in training graduate students and teaching undergraduate and high school students.

The docking problem is to computationally determine the three-dimensional (3D) structure of the complex formed by two unbound macro-molecules, given their individual structures. Solving this problem requires detailed sampling of an energy-based scoring function over a complex search space. Due to the high cost of energy function evaluation and the extremely rugged energy landscape, the sampling is computationally challenging. The goal of this proposal is to develop algorithms for fast energy evaluation, and corresponding sampling methods, for modeling interactions among flexible macro-molecules with many degrees of freedom. The basic idea is to represent the molecular system as a forest (i.e., a set of disjoint trees) of rigid clusters connected by hinges, to calculate the interaction energies for all relative orientations of all rigid clusters on grids, and to store the calculated energy grids on the manifold of all rotational and translational states. The energy of the system can then be obtained by summing the interaction energies extracted from the pre-calculated lookup tables of cluster interaction energies. The key advantage of the proposed approach is a significant reduction in the cost of energy evaluation. In addition, the search is performed on a manifold of the appropriate dimension. The major challenge in making this approach practical is storing the large interaction energy tables of rigid clusters in memory. To solve this problem, the investigators will develop approaches and algorithms to compress interaction energy data using wavelets, aiming for good accuracy, a high level of compression, and fast lookup speed. In addition, specialized sampling algorithms that use wavelet-compressed energy grids on the search manifold will be developed, and their application to flexible macro-molecular docking problems will be studied. The developed algorithms will be evaluated for performance and behavior. The approaches will be released as an open-source software library and made available to end users of docking through the ClusPro server.
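
The lookup-table idea described above can be illustrated with a small Python sketch. It is not the investigators' implementation. It assumes a one-dimensional energy grid per cluster pair, whereas the proposal stores grids over rotational and translational states, and it assumes the Haar wavelet, since the abstract does not say which wavelet family will be used. All names (`haar_forward`, `compress`, `lookup`, `total_energy`) and the energy values are hypothetical. The sketch shows three things: small wavelet coefficients are dropped to save memory, a single grid value is recovered in a logarithmic number of steps without decompressing the whole table, and the energy of the system is the sum of such lookups over cluster pairs.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(values):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    out = [float(v) for v in values]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        sums = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = sums + diffs  # coarse part first, then this level's details
        n = half
    return out


def compress(values, threshold):
    """Keep only coefficients with magnitude >= threshold, stored sparsely."""
    coeffs = haar_forward(values)
    sparse = {i: c for i, c in enumerate(coeffs) if abs(c) >= threshold}
    return sparse, len(coeffs)


def lookup(table, index):
    """Reconstruct one grid value in O(log n) steps from sparse coefficients."""
    sparse, n = table
    if not 0 <= index < n:
        raise IndexError("grid index out of range")
    value = sparse.get(0, 0.0)  # coarsest scaling coefficient
    width = 1                   # number of detail coefficients at this level
    while width < n:
        pos = index * 2 * width // n                # position at the next finer level
        detail = sparse.get(width + pos // 2, 0.0)  # dropped coefficients count as 0
        if pos % 2 == 0:
            value = (value + detail) / SQRT2
        else:
            value = (value - detail) / SQRT2
        width *= 2
    return value


def total_energy(tables, pose_index):
    """Sum pairwise cluster energies read from the compressed lookup tables."""
    return sum(lookup(table, pose_index[pair]) for pair, table in tables.items())


# Hypothetical energy grids for two pairs of rigid clusters (made-up values).
energy_grids = {
    ("A", "B"): [0.0, -0.5, -2.0, -2.1, -0.4, 0.0, 0.1, 0.0],
    ("B", "C"): [1.0, 0.9, 0.2, -1.5, -1.6, -0.3, 0.5, 1.0],
}
tables = {pair: compress(grid, threshold=0.05) for pair, grid in energy_grids.items()}

# Discretized relative pose of each cluster pair (assumed grid indices).
pose = {("A", "B"): 2, ("B", "C"): 4}
approx = total_energy(tables, pose)
exact = sum(energy_grids[pair][i] for pair, i in pose.items())
```

Because the transform is orthonormal, the error of each reconstructed value is bounded by the Euclidean norm of the dropped coefficients, so the threshold trades memory for accuracy in a controlled way. A production version would need multi-dimensional grids and a wavelet family chosen for the smoothness of the energy function, neither of which this sketch addresses.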

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start:
Project End:
Budget Start: 2018-06-15
Budget End: 2021-05-31
Support Year:
Fiscal Year: 2018
Total Cost: $460,000
Indirect Cost:
Name: State University New York Stony Brook
Department:
Type:
DUNS #:
City: Stony Brook
State: NY
Country: United States
Zip Code: 11794