Large-scale protein interaction networks have been determined experimentally for several organisms, and computational analysis of these networks provides new opportunities to uncover protein functions and pathways. At the same time, despite improvements in high-throughput technologies, applying them to all sequenced genomes will not be feasible in the near future. Thus, for the vast majority of sequenced genomes, only a small fraction of known protein interactions have been experimentally determined, and novel computational approaches provide a promising alternative means for building large, high-confidence interaction maps. The broad, long-term goal of this research is to build a comprehensive research program for understanding protein interactions by developing algorithms for the complementary problems of analyzing and predicting protein interaction maps.
Our specific aims are: (1) To develop algorithms that exploit the topology of whole-genome protein interaction maps and the relationships between protein functions, in order to make novel predictions about a protein's biological process. (2) To build a system for interrogating protein interaction networks using "templates" that specify common patterns of interactions or pathways, in order to help uncover novel instances. (3) To develop a general structural bioinformatics approach for leveraging properties of specific protein interaction interfaces, and to apply this approach to help predict Cys2His2 zinc finger protein-DNA interactions at the genomic scale. Taken together, we hope that the proposed tools will significantly advance the state of the art in computational approaches for characterizing proteins within the context of their cellular interactions, pathways and networks. All software and predictions will be made publicly available via the internet.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Remington, Karin A
Princeton University
Biostatistics & Other Math Sci
Schools of Engineering
United States
Persikov, Anton V; Wetzel, Joshua L; Rowland, Elizabeth F et al. (2015) A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res 43:1965-84
Nadimpalli, Shilpa; Persikov, Anton V; Singh, Mona (2015) Pervasive variation of transcription factor orthologs contributes to regulatory network evolution. PLoS Genet 11:e1005011
Pritykin, Yuri; Ghersi, Dario; Singh, Mona (2015) Genome-Wide Detection and Analysis of Multifunctional Genes. PLoS Comput Biol 11:e1004467
Ochoa, Alejandro; Storey, John D; Llinás, Manuel et al. (2015) Beyond the E-Value: Stratified Statistics for Protein Domain Prediction. PLoS Comput Biol 11:e1004509
Persikov, Anton V; Singh, Mona (2014) De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins. Nucleic Acids Res 42:97-108
Ghersi, Dario; Singh, Mona (2014) molBLOCKS: decomposing small molecule sets and uncovering enriched fragments. Bioinformatics 30:2081-3
Jiang, Peng; Singh, Mona (2014) CCAT: Combinatorial Code Analysis Tool for transcriptional regulation. Nucleic Acids Res 42:2833-47
Ghersi, Dario; Singh, Mona (2014) Interaction-based discovery of functionally important genes in cancers. Nucleic Acids Res 42:e18
Persikov, Anton V; Rowland, Elizabeth F; Oakes, Benjamin L et al. (2014) Deep sequencing of large library selections allows computational discovery of diverse sets of zinc fingers that bind common targets. Nucleic Acids Res 42:1497-508
Pritykin, Yuri; Singh, Mona (2013) Simple topological features reflect dynamics and modularity in protein interaction networks. PLoS Comput Biol 9:e1003243

Showing the most recent 10 out of 30 publications