Drugs undergo extensive testing in animals and clinical trials in humans before they are marketed for widespread use in the population. Pre-market testing produces reasonably high-quality information about the efficacy of the drug as a treatment for the condition for which it was approved, but gives a very incomplete picture of the drug's safety. Post-marketing surveillance currently relies mainly on voluntary reporting to the FDA by health care professionals (and, more recently, patients themselves) through MedWatch, the FDA's safety information and adverse event reporting program. Self-reported patient information captures a valuable perspective that has been found to be of similar quality to that provided by health professionals, but it is currently captured only via the formal MedWatch form. The overarching goal of this application is to deploy the infrastructure needed to explore the value of informal social network postings as a source of "signals" of potential adverse drug reactions soon after the drugs hit the market, paying particular attention to the value such information might have for detecting adverse events earlier than currently possible and for detecting effects not easily captured by traditional means. Despite the significant challenge of processing colloquial text, our prototype study in this direction showed promising performance in identifying adverse reactions mentioned in these postings, with significant correlations between the effects mentioned by the public and those documented for the drugs we studied.
Specific aims to be addressed include:
1) To establish the infrastructure that enables processing of online user comments about drugs on health-related social network websites. In particular, we seek to recognize and extract mentions of adverse effects in those informal postings and to map them to standard terminology. We will build on our preliminary lexical approach for finding the mentions and propose a variation of machine learning (commonly referred to as active learning) in which the learning framework controls which instances are selected for use in the training data, among other innovative semantic approaches to normalization (mapping of the mentions to established, formal terms) and sentiment analysis (to discover whether a mention is reporting a positive or a negative effect).
2) To evaluate the sensitivity and specificity of the extraction and identification systems, as well as the predictive value of the extracted knowledge, through specific case studies of a set of drugs with well-known adverse reactions and by monitoring postings about a select group of drugs released since 2007. Our existing manually annotated gold standard will be expanded through a dedicated annotation effort led by a pharmacologist (Karen Smith).
3) To compare the knowledge extracted from patient comments to what is derived from the established drug safety monitoring scheme overseen by the FDA.
We recognize that the data obtained through the deployed infrastructure could not, on their own, be used to define an adverse drug reaction (ADR). However, if this method is validated, it could provide useful signals to complement the already established processes and data sources.
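As a rough illustration of two ideas named in the aims, the sketch below shows (a) one common form of active learning, uncertainty sampling, in which the learner asks for annotation of the postings it is least sure about, and (b) how sensitivity and specificity could be computed against a gold standard. The function names, the 0.5 decision boundary, and the sample values are assumptions made for this example; the abstract does not specify which selection strategy or evaluation code the project uses.

```python
from typing import List, Tuple


def select_most_uncertain(probs: List[float], k: int) -> List[int]:
    """Uncertainty sampling (an assumed strategy, not confirmed by the source).

    probs[i] is a classifier's predicted probability that unlabeled posting i
    mentions an adverse effect. Returns the indices of the k postings whose
    probability is closest to 0.5, i.e. those the model is least sure about.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]


def sensitivity_specificity(gold: List[bool], predicted: List[bool]) -> Tuple[float, float]:
    """Compare system output with gold-standard annotations.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative gold label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical model scores for five unlabeled postings.
    scores = [0.95, 0.52, 0.10, 0.45, 0.70]
    print(select_most_uncertain(scores, 2))

    # Hypothetical gold labels and system predictions for five postings.
    gold = [True, True, False, False, True]
    pred = [True, False, False, True, True]
    print(sensitivity_specificity(gold, pred))
```

In an annotation loop, the postings returned by `select_most_uncertain` would be sent to the annotators, added to the training data, and the model retrained before the next round of selection.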

Public Health Relevance

Adverse drug reactions are currently listed as one of the top 10 causes of death in the US. Identifying adverse effects of drugs after they are publicly marketed depends mainly on voluntary reporting to the FDA by health care professionals and, more recently, patients themselves, via a formal online form. The goal of this project is to develop the tools needed to exploit the numerous informal postings that patients make to health-related social networks such as Daily Strength as a source of signals of potential adverse drug reactions soon after the drugs hit the market. These tools could provide useful signals about adverse effects earlier than currently possible, hastening the FDA's intervention and reducing the impact that an adverse effect can have on public health.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Special Emphasis Panel (ZLM1-ZH-C (01))
Program Officer: Vanbiervliet, Alan
Arizona State University-Tempe Campus
Biomedical Engineering
Schools of Engineering
United States
Smith, Karen; Golder, Su; Sarker, Abeed et al. (2018) Methods to Compare Adverse Events in Twitter to FAERS, Drug Information Databases, and Systematic Reviews: Proof of Concept with Adalimumab. Drug Saf 41:1397-1410
Sarker, Abeed; Gonzalez-Hernandez, Graciela (2018) An unsupervised and customizable misspelling generator for mining noisy health-related text sources. J Biomed Inform 88:98-107
Klein, Ari Z; Sarker, Abeed; Cai, Haitao et al. (2018) Social media mining for birth defects research: A rule-based, bootstrapping approach to collecting data for rare health-related events on Twitter. J Biomed Inform 87:68-78
Sarker, Abeed; Nikfarjam, Azadeh; Gonzalez, Graciela (2016) SOCIAL MEDIA MINING SHARED TASK WORKSHOP. Pac Symp Biocomput 21:581-92
Sarker, Abeed; O'Connor, Karen; Ginn, Rachel et al. (2016) Social Media Mining for Toxicovigilance: Automatic Monitoring of Prescription Medication Abuse from Twitter. Drug Saf 39:231-40
Sullivan, Ryan; Sarker, Abeed; O'Connor, Karen et al. (2016) FINDING POTENTIALLY UNSAFE NUTRITIONAL SUPPLEMENTS FROM USER REVIEWS WITH TOPIC MODELING. Pac Symp Biocomput 21:528-39
Korkontzelos, Ioannis; Nikfarjam, Azadeh; Shardlow, Matthew et al. (2016) Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. J Biomed Inform 62:148-58
Gonzalez, Graciela H; Tahsin, Tasnia; Goodale, Britton C et al. (2016) Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform 17:33-42
Paul, Michael J; Sarker, Abeed; Brownstein, John S et al. (2016) SOCIAL MEDIA MINING FOR PUBLIC HEALTH MONITORING AND SURVEILLANCE. Pac Symp Biocomput 21:468-79
Sarker, Abeed; Ginn, Rachel; Nikfarjam, Azadeh et al. (2015) Utilizing social media data for pharmacovigilance: A review. J Biomed Inform 54:202-12
