Clinical trials, which test the safety and efficacy of drugs in a controlled population, cannot identify all safety issues associated with drugs because the size and characteristics of the target population, the duration of use, and the concomitant disease conditions and therapies differ markedly under actual usage conditions. On the outpatient side, medication-related morbidity and mortality in the United States are estimated to result in 100,000 deaths and $177 billion in costs annually. On the inpatient side, it is estimated that roughly 30% of hospital stays involve an adverse drug event. Current one-drug-at-a-time methods for surveillance are woefully inadequate because no one monitors the "real life" situation of patients receiving more than 3 concomitant drugs in the context of multiple co-morbidities. In preliminary work, we have built an annotation and analysis pipeline that uses the knowledge graph formed by public biomedical ontologies for the purpose of data-mining unstructured clinical notes. We have demonstrated that we can reproduce drug safety signals from the clinical notes on average 2.7 years ahead of the issue of a drug safety alert. Using this pipeline, we propose: 1) to identify and prioritize multi-drug combinations that are worth testing; 2) to develop methods for discovering adverse event profiles of multi-drug combinations; and 3) to create an EHR-derived catalogue of potential adverse events of multi-drug combinations. We will use hierarchies provided by existing public ontologies for drugs, diseases and side effects to improve signal detection by aggregation, to reduce multiple hypothesis testing and to make a search for multi-drug side effects computationally tractable.
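
To make the aggregation idea concrete, the following is a minimal sketch rather than the project's actual pipeline. It assumes a hand-written drug-to-class map (`DRUG_CLASS`) standing in for a real ontology hierarchy, a handful of invented toy records (`RAW_NOTES`), and an odds ratio with a continuity correction as the signal statistic; the abstract does not say which statistic is used, so that choice and every name in the block are illustrative assumptions.

```python
from math import exp, log, sqrt

# Hypothetical drug -> class map; a real system would read this from an ontology.
DRUG_CLASS = {
    "simvastatin": "statin",
    "atorvastatin": "statin",
    "fluoxetine": "ssri",
    "sertraline": "ssri",
}

# Invented toy data: (drugs mentioned for a patient, events mentioned).
RAW_NOTES = [
    ({"simvastatin", "fluoxetine"}, {"myopathy"}),
    ({"atorvastatin", "sertraline"}, {"myopathy"}),
    ({"simvastatin"}, set()),
    ({"fluoxetine"}, set()),
    ({"atorvastatin", "fluoxetine"}, set()),
    ({"metformin"}, {"myopathy"}),
    ({"metformin"}, set()),
    ({"sertraline"}, set()),
]


def roll_up(drugs):
    """Replace each drug by its parent class; unmapped drugs are kept as-is."""
    return frozenset(DRUG_CLASS.get(d, d) for d in drugs)


def contingency(records, exposure, event):
    """Return 2x2 counts (a, b, c, d) for a drug combination and an event.

    a: exposed with event, b: exposed without event,
    c: unexposed with event, d: unexposed without event.
    A record counts as exposed only if it contains every item in `exposure`.
    """
    a = b = c = d = 0
    for drugs, events in records:
        exposed = exposure <= drugs
        has_event = event in events
        if exposed and has_event:
            a += 1
        elif exposed:
            b += 1
        elif has_event:
            c += 1
        else:
            d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio with a Haldane-Anscombe correction and a 95% Wald interval."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - 1.96 * se), exp(log(estimate) + 1.96 * se)


# Drug-level view: one specific pair, with very sparse counts.
drug_level = contingency(RAW_NOTES, {"simvastatin", "fluoxetine"}, "myopathy")

# Class-level view: the same question after rolling drugs up to their classes.
rolled = [(roll_up(drugs), events) for drugs, events in RAW_NOTES]
class_level = contingency(rolled, frozenset({"statin", "ssri"}), "myopathy")

print("drug pair counts:", drug_level)
print("class pair counts:", class_level)
print("class pair odds ratio (estimate, low, high):", odds_ratio(*class_level))
```

In this toy setting the four drugs form six possible drug pairs but only one class pair, which is the sense in which rolling up a hierarchy both pools sparse counts and cuts the number of hypotheses to test. With so few records the interval is far too wide to support any conclusion; the example only shows the mechanics.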

Public Health Relevance

We propose to combine data from electronic medical records, adverse event reports in AERS, and prior knowledge in curated knowledgebases to construct a data-driven safety profile for drugs. Successful development of novel methods will result in significant cost savings as well as a significant increase in patient safety, given the current rate (~30%) of occurrence of adverse drug events. Completion of the aims will result in a first-of-its-kind, EHR-derived resource of adverse event profiles of drugs.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Stanford University
Internal Medicine/Medicine
Schools of Medicine
United States
Iyer, Srinivasan V; Harpaz, Rave; LePendu, Paea et al. (2014) Mining clinical text for signals of adverse drug-drug interactions. J Am Med Inform Assoc 21:353-62
White, R W; Harpaz, R; Shah, N H et al. (2014) Toward enhanced pharmacovigilance using patient-generated data on the internet. Clin Pharmacol Ther 96:239-46
Harpaz, Rave; Callahan, Alison; Tamang, Suzanne et al. (2014) Text mining for adverse drug events: the promise, challenges, and state of the art. Drug Saf 37:777-90
Huang, Sandy H; LePendu, Paea; Iyer, Srinivasan V et al. (2014) Toward personalizing treatment for depression: predicting diagnosis and severity. J Am Med Inform Assoc 21:1069-75