Protein-protein and protein-nucleic acid interactions form the basis of the most fundamental processes in the living cell. These interactions are a central topic of genomics and systems biology research and are involved in a variety of diseases, including cancer. Because experimental data are lacking, computational inference is often the only way to identify protein interaction interfaces and to study their properties. However, progress in this important area has been hampered by the absence of efficient and well-performing computational methods capable of dealing with the conformational changes associated with macromolecular interactions. The objective of this proposal is to develop accurate high-throughput computational methods for predicting protein interfaces involved in protein-protein and protein-nucleic acid interactions. First, we will create general-purpose benchmark datasets for developing and validating computational methods for the analysis of protein interaction interfaces. Second, we will use these datasets to develop and validate a novel high-throughput methodology for predicting protein interaction sites that explicitly accounts for conformational flexibility. All software and datasets created during this project will be made publicly available online on a dedicated web server. The computational methodology developed under this proposal will facilitate macromolecular docking and allow researchers to perform genome-scale analyses of protein interactions involved in human diseases, including cancer-related interactions such as those involved in gene regulation and signal transduction. The expected outcomes will therefore have a direct beneficial impact on human health.
The proposed research is significant because it will address the fundamental problem of identifying the interfaces involved in protein-protein and protein-nucleic acid interactions. It will facilitate research in areas directly related to human health, including studies of protein interactions involved in diseases such as cancer.