Molecular interactions play a central role in all biological processes. Akin to the complete sequencing of genomes, a complete description of interactomes is a fundamental step towards a deeper understanding of biological processes and has vast potential to impact systems biology, genomics, molecular biology, and therapeutics. Protein-protein interactions (PPIs) and protein-RNA interactions (PRIs) are of particular interest because they are critical to the maintenance of cellular integrity, metabolism, transcription/translation, and cell-cell communication. Although high-throughput experimental PPI and PRI data are rapidly accumulating, building complete and confident datasets requires multiple replicates of expensive screens. This proposal aims to develop new methods that will significantly advance structure-based approaches to predicting PPIs and PRIs and boost confidence in emerging high-throughput (HTP) data, with the goal of comprehensive interactome mapping at lower cost. Taken together, these methods will vastly expand our understanding of macromolecular networks. We will continue to devise structure-based methods for protein-protein interaction prediction and branch out to methods for protein-RNA interaction prediction; this represents a major shift from the purely sequence-based approaches that most bioinformatics methods use to predict molecular interactions, and it will enhance the coverage and accuracy of the complete interactome. We will also build computational frameworks for boosting confidence in HTP protein-protein and protein-RNA interaction datasets using structure-based approaches; these frameworks will provide a comprehensive assessment of in-house and public HTP data, with potential biomedical applications such as heat shock protein-kinase interactions relevant to the development of cancer therapeutics, MAPK6's role in a cancer-related signaling network, and (long non-coding) RNA-protein binding roles in neurodegenerative disease.
Finally, we will computationally screen for PPIs and PRIs at the genome scale and expand our Struct2Net webserver to disseminate tools based on our methods and results to the community. An increasing number of HTP interaction datasets are being determined, presenting new opportunities to leverage these data in conjunction with structural insights to map binding sites and to uncover the molecular mechanisms underlying cellular functions. Successful completion of these aims will result in computational methods that significantly increase our confidence in high-throughput data on protein-protein and protein-RNA interactions and reveal fundamental aspects of their functioning, as well as testable hypotheses for experimental investigation. All developed software will be made publicly available.

Public Health Relevance

Biological processes are carried out through thousands of interactions between various types of molecules (the Interactome) that play fundamental roles in all biomedical processes, including the maintenance of cellular integrity, metabolism, transcription/translation, and cell-cell communication. Understanding these interaction networks on a large scale will empower both rational, targeted drug design and more intelligent disease management. In this project, we develop computational methods for structure-based prediction of protein-protein and protein-RNA interactions, and integrate these predictions with available high-throughput genomic data to predict the Interactomes of entire species' genomes.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Wu, Mary Ann
Massachusetts Institute of Technology
Organized Research Units
United States
Ma, Cheng-Yu; Chen, Yi-Ping Phoebe; Berger, Bonnie et al. (2017) Identification of protein complexes by integrating multiple alignment of protein interaction networks. Bioinformatics 33:1681-1688
Orenstein, Yaron; Puccinelli, Robert; Kim, Ryan et al. (2017) Optimized Sequence Library Design for Efficient In Vitro Interaction Mapping. Cell Syst 5:230-236.e5
Liu, Yang; Palmedo, Perry; Ye, Qing et al. (2017) Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks. Cell Syst :
Khurana, Vikram; Peng, Jian; Chung, Chee Yeun et al. (2017) Genome-Scale Networks Link Neurodegenerative Disease Genes to α-Synuclein through Specific Molecular Pathways. Cell Syst 4:157-170.e14
Toth-Petroczy, Agnes; Palmedo, Perry; Ingraham, John et al. (2016) Structured States of Disordered Proteins from Genomic Sequences. Cell 167:158-170.e12
Cho, Hyunghoon; Berger, Bonnie; Peng, Jian (2016) Compact Integration of Multi-Network Topology for Functional Analysis of Genes. Cell Syst 3:540-548.e5
Nazeen, Sumaiya; Palmer, Nathan P; Berger, Bonnie et al. (2016) Integrative analysis of genetic data sets reveals a shared innate immune component in autism spectrum disorder and its co-morbidities. Genome Biol 17:228
Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie (2016) RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data. Bioinformatics 32:i351-i359
Sahni, Nidhi; Yi, Song; Taipale, Mikko et al. (2015) Widespread macromolecular interaction perturbations in human genetic disorders. Cell 161:647-660
Cho, Hyunghoon; Berger, Bonnie; Peng, Jian (2015) Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks. Res Comput Mol Biol 9029:62-64

Showing the most recent 10 out of 41 publications