While the massive amount of molecular bioactivity data creates new opportunities, it also hinders biomedical research because many separate and heterogeneous data sources are inherently difficult to process. The quality and type of input data often limit a project's outcome. To improve research outcomes, access to all available data and the testing of multiple alternative hypotheses are essential. Targeting less experienced end-users, we will develop tools that facilitate "jumps" in the small molecule / bioactivity / biomedical data area, leading from one potential solution to another and encouraging users to explore multiple, alternative hypotheses. We will integrate data from multiple bioactivity databases, including PubChem, ChemBank, ChEMBL, PDSP and WOMBAT, into one centralized system. We will develop advanced chemical pattern recognition algorithms and deliver a Cytoscape-based visualization tool for the global exploration of relationships between chemical patterns and biological activities/targets.

We will achieve this via three Specific Aims:
1. Create one simple unified interface for many heterogeneous databases, CARLSBAD (Confederated Annotated Research Libraries of Small molecule Biological Activity Data); the database will reconcile small molecule bioactivity data across multiple sources for human, rat and mouse targets.
2. Develop advanced algorithms for chemical pattern detection and annotation; we will detect the Maximum Overlapping Set (MOS) and HierS (hierarchical scaffolds) and annotate chemicals in CARLSBAD accordingly.
3. Develop a Cytoscape plugin for the visualization and exploration of chemical pattern bioactivity networks.

Via MOS/HierS patterns, users will be able to identify target-specific chemical signatures (determinants for activity and selectivity); in the absence of specific signals, these patterns will serve as a rationale for off-target and promiscuous bioactivity prediction.
Storing unique target-ligand bioactivity data as well as chemical patterns, CARLSBAD will be designed, implemented and maintained on an enterprise platform for use by the scientific community. The new Cytoscape plugin will integrate with existing core components and plugins to bridge across chemistry and biology in a multi-disciplinary manner.
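
The reconciliation and pattern-network ideas above can be illustrated with a minimal sketch. It is not the CARLSBAD schema or algorithm: the record layout, the scaffold identifiers (which in the project would come from MOS/HierS detection), the use of the median to merge duplicate measurements, and the activity threshold are all assumptions made for illustration, and the data values are invented.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (source, compound_id, scaffold_id, target, activity as -log10 M).
# Scaffold ids stand in for MOS/HierS annotations; all values are made up.
RECORDS = [
    ("ChEMBL",  "cpd1", "scafA", "DRD2",  7.2),
    ("PubChem", "cpd1", "scafA", "DRD2",  6.8),
    ("PDSP",    "cpd2", "scafA", "DRD2",  8.0),
    ("ChEMBL",  "cpd2", "scafA", "HTR2A", 6.1),
    ("WOMBAT",  "cpd3", "scafB", "HTR2A", 5.0),
]


def reconcile(records):
    """Collapse duplicate (compound, target) measurements into one median value."""
    grouped = defaultdict(list)
    for _source, compound, _scaffold, target, value in records:
        grouped[(compound, target)].append(value)
    return {key: median(values) for key, values in grouped.items()}


def scaffold_target_edges(records, threshold=6.0):
    """Map (scaffold, target) to the number of distinct active compounds.

    A compound counts as active when its reconciled activity is >= threshold.
    """
    merged = reconcile(records)
    scaffold_of = {compound: scaffold for _s, compound, scaffold, _t, _v in records}
    members = defaultdict(set)
    for (compound, target), value in merged.items():
        if value >= threshold:
            members[(scaffold_of[compound], target)].add(compound)
    return {edge: len(compounds) for edge, compounds in members.items()}


edges = scaffold_target_edges(RECORDS)
for (scaffold, target), n_active in sorted(edges.items()):
    print(f"{scaffold} -> {target}: {n_active} active compound(s)")
```

Each resulting (scaffold, target) pair corresponds to one edge of a bipartite pattern-target network of the kind a Cytoscape plugin could display; a scaffold linked to many unrelated targets would be a candidate promiscuous pattern.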

Public Health Relevance

The proposed research aims to empower the chemistry and biology research community with an innovative, network-based tool for mining vast amounts of chemical and biological data. It will provide an effective and improved way for researchers to evaluate, visualize and explore small molecule bioactivity data in a multi-disciplinary manner, thus leading to improved output in human health research.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Exploratory/Developmental Grants (R21)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Lyster, Peter
University of New Mexico Health Sciences Center
Schools of Medicine
United States
Zahoránszky-Kőhalmi, Gergely; Bologa, Cristian G; Oprea, Tudor I (2016) Impact of similarity threshold on the topology of molecular similarity networks and clustering outcomes. J Cheminform 8:16
Yang, Jeremy J; Ursu, Oleg; Lipinski, Christopher A et al. (2016) Badapple: promiscuity patterns from noisy evidence. J Cheminform 8:29
Bologa, Cristian G; Ursu, Oleg; Oprea, Tudor I et al. (2013) Emerging trends in the discovery of natural product antibacterials. Curr Opin Pharmacol 13:678-87
Mathias, Stephen L; Hines-Kay, Jarrett; Yang, Jeremy J et al. (2013) The CARLSBAD database: a confederated database of chemical bioactivities. Database (Oxford) 2013:bat044
Kim Kjaerulff, Sonny; Wich, Louis; Kringelum, Jens et al. (2013) ChemProt-2.0: visual navigation in a disease chemical biology database. Nucleic Acids Res 41:D464-9
Manallack, David T; Prankerd, Richard J; Yuriev, Elizabeth et al. (2013) The significance of acid/base properties in drug discovery. Chem Soc Rev 42:485-96
Manallack, David T; Prankerd, Richard J; Nassta, Gemma C et al. (2013) A chemogenomic analysis of ionization constants--implications for drug discovery. ChemMedChem 8:242-55
Bologa, Cristian G; Oprea, Tudor I (2012) Compound collection preparation for virtual screening. Methods Mol Biol 910:125-43
Oprea, Tudor I; Taboureau, Olivier; Bologa, Cristian G (2012) Of possible cheminformatics futures. J Comput Aided Mol Des 26:107-12
Broccatelli, Fabio; Cruciani, Gabriele; Benet, Leslie Z et al. (2012) BDDCS class prediction for new molecular entities. Mol Pharm 9:570-80

Showing the most recent 10 out of 17 publications