The need to monitor unintended effects of approved drugs has been highlighted by several recent high-profile events in which fatal side effects of drugs were detected after their release to market. Notoriously, the COX-2 inhibitor rofecoxib (Vioxx) was withdrawn from the market on account of evidence suggesting that treatment with the drug increased the rate of myocardial infarction. More recently, proton pump inhibitors have been associated with a host of previously undetected serious side effects, including chronic kidney disease. Statistical analyses of several sorts of data have been undertaken in an effort to mitigate the morbidity and mortality resulting from such side effects by accelerating their detection. These include data from adverse event reporting systems, electronic health records (EHRs) and administrative claims data, social media communication, and consumer search logs. Each of these sources presents challenges related to data completeness, accuracy, quality and representation, as well as the potential for bias. Though methods for combining multiple data sources show some promise as a way to address their particular inadequacies, strongly correlated drug-event pairs emerging from secondary analysis of observational data must ultimately be reviewed by domain experts to assess their implications. As the availability of the requisite expertise is limited, there is a pressing need for new methods to distinguish plausibly causal relationships from the large number of false positive associations that may emerge from large-scale analysis of observational data. In the proposed research, we will develop automated methods through which large amounts of knowledge extracted from the biomedical literature are used to constrain the parameterization of predictive models of large data sets.
These methods will leverage high-dimensional distributed vector representations of conceptual relations extracted from the literature to integrate extracted knowledge into predictive models of observational data. Our hypothesis is that the predictions that result from such joint models will be both biologically plausible and strongly associated, resulting in more accurate predictions than those that can be obtained through estimation of correlation from observational data alone. The developed methods will be evaluated formatively for accuracy against a set of drug/side-effect reference standards, and summatively for their ability to predict label changes such as "black box" warnings, using historical data and knowledge to estimate their "time-to-detection" of safety concerns. In addition, we will develop and evaluate an interactive interface permitting users to explore the evidence used by the resulting models to make predictions, by retrieving supporting assertions from the literature and statistics from observational data. If successful, the proposed research will provide the means to identify plausible drug-event pairs for regulatory purposes, mitigating consequent morbidity and mortality. In addition, the methods will provide a generalizable approach that can be used to apply knowledge derived from the biomedical literature to draw robust inferences from observational clinical data.

Public Health Relevance

The need to monitor unintended effects of medications has been highlighted by several high-profile events in which fatal side effects of approved drugs were detected after their release to market. In the proposed research, we will develop and evaluate methods to identify biologically plausible adverse drug events using both observational data and knowledge extracted from the biomedical literature. If successful, these methods will provide the means for earlier detection of harmful drug effects, limiting consequent morbidity and mortality.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM011563-06
Application #: 9707917
Study Section: Special Emphasis Panel (ZLM1)
Program Officer: Sim, Hua-Chuan
Project Start: 2013-09-01
Project End: 2021-05-31
Budget Start: 2019-06-01
Budget End: 2020-05-31
Support Year: 6
Fiscal Year: 2019
Total Cost:
Indirect Cost:
Name: University of Washington
Department: Other Health Professions
Type: Schools of Medicine
DUNS #: 605799469
City: Seattle
State: WA
Country: United States
Zip Code: 98195
Mower, Justin; Subramanian, Devika; Cohen, Trevor (2018) Learning predictive models of drug side-effect relationships from distributed representations of literature-derived semantic predications. J Am Med Inform Assoc 25:1339-1350
Cai, Ruichu; Liu, Mei; Hu, Yong et al. (2017) Identification of adverse drug-drug interactions through causal association rule discovery from spontaneous adverse event reports. Artif Intell Med 76:7-15
Yu, Zhiguo; Wallace, Byron C; Johnson, Todd et al. (2017) Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness. Stud Health Technol Inform 245:657-661
Amith, Muhammad; Cunningham, Rachel; Savas, Lara S et al. (2017) Using Pathfinder networks to discover alignment between expert and consumer conceptual knowledge from online vaccine content. J Biomed Inform 74:33-45
Cohen, Trevor; Widdows, Dominic (2017) Embedding of semantic predications. J Biomed Inform 68:150-166
Mower, Justin; Subramanian, Devika; Shang, Ning et al. (2016) Classification-by-Analogy: Using Vector Representations of Implicit Relationships to Identify Plausibly Causal Drug/Side-effect Relationships. AMIA Annu Symp Proc 2016:1940-1949
Malec, Scott A; Wei, Peng; Xu, Hua et al. (2016) Literature-Based Discovery of Confounding in Observational Clinical Data. AMIA Annu Symp Proc 2016:1920-1929
Widdows, Dominic; Cohen, Trevor (2015) Reasoning with Vectors: A Continuous Model for Fast Robust Inference. Log J IGPL 23:141-173
Shang, Ning; Xu, Hua; Rindflesch, Thomas C et al. (2014) Identifying plausible adverse drug reactions using knowledge extracted from the literature. J Biomed Inform 52:293-310
Cohen, T; Widdows, D; Stephan, C et al. (2014) Predicting high-throughput screening results with scalable literature-based discovery methods. CPT Pharmacometrics Syst Pharmacol 3:e140