Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods

Sarker, Abeed

Abstract

The problem of prescription medication (PM) abuse has reached epidemic proportions in the United States. According to a 2014 report by the Director of the National Institute on Drug Abuse (NIDA), an estimated 52 million people have been involved in the non-medical use of PMs, a significant portion of which can be classified as abuse. PMs that are commonly abused include opioids, central nervous system depressants and stimulants, and the consequences of their abuse may be severe. Increases in PM misuse and abuse over the last 15 years have resulted in increased emergency department visits, rates of addiction and overdose deaths. Due to the rapidly escalating morbidity and mortality, the problem is now receiving national attention. The opioid crisis, which has its root in opioid-based PM abuse, has been declared a national emergency by the president of the United States. Despite the problems associated with PM abuse, surveillance programs such as prescription drug monitoring programs (PDMPs) are inadequate and suffer from numerous shortcomings, which limits their usefulness in real life. Studies evaluating the long-term effects of distinct classes of PMs on cohorts of abusers are scarce and expensive to conduct. To better characterize the problem and to monitor it in real time, new sources of information need to be identified and novel monitoring techniques need to be developed. To address these problems, our project aims to utilize social media data for performing toxicovigilance. Social media encapsulates an abundance of knowledge about PM abuse and the abusers in the form of noisy natural language text. At the heart of the proposed approach is a machine learning system that can automatically distinguish between user posts collected from social media that indicate abuse and those that do not. Using this classification system, users will be categorized into multiple groups: (i) abusers, (ii) medical users and (iii) non-users.
The developed system will collect longitudinal data for users exposed to the selected PMs via periodic collection of their publicly available posts/discussions and automatically categorize them based on age, gender and additional demographic features, when possible. This will enable observational studies on targeted cohorts involving hundreds of thousands of cohort members. The cohort studies will focus on analyzing the transition rates from medical use to abuse for distinct PMs and the transition rates from abuse of PMs to illicit analogs. Implementation of this data-centric framework, which will be open source, will revolutionize the mechanism by which PM abuse monitoring is performed and enable the future development of intervention strategies targeted towards specific cohorts at the most effective time periods.
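
As an illustration only, the user categorization and the medical-use-to-abuse transition rate described above could be sketched as follows. This is not the project's implementation: the label names, the grouping rule (any abuse-indicating post makes a user an abuser) and the definition of a transition (a user whose first PM-related post indicates medical use and who later has an abuse-indicating post) are all assumptions made for the example, and the per-post labels are assumed to come from a classifier that is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical per-post labels, assumed to be produced by a classifier (not shown).
ABUSE, MEDICAL, NON_USE = "abuse", "medical_use", "non_use"


def categorize_user(post_labels: List[str]) -> str:
    """Assign a user to one of three groups from per-post labels.

    Assumed rule: any abuse-indicating post -> abuser; otherwise any
    medical-use post -> medical user; otherwise non-user.
    """
    if ABUSE in post_labels:
        return "abuser"
    if MEDICAL in post_labels:
        return "medical_user"
    return "non_user"


def transition_rate(timelines: Dict[str, List[Tuple[str, str]]]) -> float:
    """Fraction of users who start with medical use and later show abuse.

    `timelines` maps a user id to (ISO date, label) pairs. Only users whose
    first PM-related post is labeled medical use enter the denominator.
    """
    started_medical = 0
    transitioned = 0
    for posts in timelines.values():
        # ISO-formatted dates sort chronologically as strings.
        ordered = sorted(posts, key=lambda post: post[0])
        labels = [label for _, label in ordered if label != NON_USE]
        if not labels or labels[0] != MEDICAL:
            continue
        started_medical += 1
        if ABUSE in labels[1:]:
            transitioned += 1
    return transitioned / started_medical if started_medical else 0.0
```

In practice the grouping rule and the transition definition would need to be chosen and validated against annotated data; a single abuse-indicating post, for instance, may be too weak a signal to label a user an abuser.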

Public Health Relevance

Prescription medication (PM) abuse is a major epidemic in the United States, and monitoring and studying the characteristics of the PM abuse problem requires the development of novel approaches. Social media encapsulates an abundance of data about PM abuse from different demographics, but extracting that data and converting it to knowledge requires advanced natural language processing and data-centric artificial intelligence systems. Our proposed social media mining framework will automate the conversion of big data to knowledge for PM abuse, providing crucial insights to toxicologists about targeted populations and enabling the future development of directed intervention strategies.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Drug Abuse (NIDA)
Type: Research Project (R01)
Project #: 5R01DA046619-04
Application #: 9933852
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Obrien, Moira

Project Start: 2019-09-01
Project End: 2022-05-31
Budget Start: 2020-06-01
Budget End: 2021-05-31
Support Year: 4
Fiscal Year: 2020

Institution

Name: Emory University
Department: Biomedical Engineering
Type: Schools of Medicine
DUNS #: 066469933

City: Atlanta
State: GA
Country: United States
Zip Code: 30322

Related projects


NIH 2020 R01 DA	Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods Sarker, Abeed H. / Emory University
NIH 2019 R01 DA	Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods Sarker, Abeed H. / University of Pennsylvania
NIH 2019 R01 DA	Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods Sarker, Abeed H. / Emory University
NIH 2018 R01 DA	Mining Social Media Big Data for Toxicovigilance: Automating the Monitoring of Prescription Medication Abuse via Natural Language Processing and Machine Learning Methods Sarker, Abeed H. / University of Pennsylvania

Publications

Sarker, Abeed; Gonzalez-Hernandez, Graciela (2018) An unsupervised and customizable misspelling generator for mining noisy health-related text sources. J Biomed Inform 88:98-107
