Patients, as the consumers of pharmaceutical products, are the most important contributors to drug safety surveillance. Yet studies show that the rate of patient participation is low when traditional methods of reporting drug effects, such as spontaneous reporting systems, are used. Studies have also documented that patients report their experiences in different ways and in more detail than healthcare professionals do. Furthermore, traditional methods of collecting patient reports are typically slow and costly. The objective of the proposed research is to investigate whether patient experiences of drug effects can be detected directly from their posts on Twitter. We propose to retrieve and collect Twitter posts and conversations related to a list of 100 pre-selected drugs, to extract from the collection the posts that demonstrate patients' personal experiences, and to identify any effects mentioned in the drug-related personal experience Tweets. These data will be analyzed to assess the relationships between different drugs and reported beneficial or adverse effects, and to ascertain whether the effects have been previously reported. A data processing pipeline will be devised to automate many components of the data collection and analysis process, which is expected to reduce the cost and speed up the discovery of potential drug effects. A machine-learning-based method will be developed to identify the Twitter posts that demonstrate a user's experience. The National Library of Medicine's MetaMap software will be employed to map the phrases identified in Twitter posts to UMLS semantic types, many of which are related to drug effects. To assess the validity of our approach, healthcare professionals with knowledge and experience in drug safety surveillance will participate in verifying and annotating drug-related personal experience Twitter posts and in confirming the drug-effect relationships.
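
The first two stages of the pipeline described above (selecting posts that mention a listed drug, then keeping those that read as personal experience) might be sketched as follows. This is only an illustration under stated assumptions: the three drug names are hypothetical placeholders for the list of 100 pre-selected drugs, and the first-person pronoun check is a toy stand-in for the machine-learning-based method, not the proposed classifier. The MetaMap mapping step is not shown.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical placeholder list; the project's 100 pre-selected drugs are not given here.
DRUGS = ["ibuprofen", "metformin", "sertraline"]

# Toy stand-in for the proposed machine-learning classifier:
# a first-person pronoun is treated as a crude signal of personal experience.
FIRST_PERSON = re.compile(r"\b(i|me|my|mine)\b", re.IGNORECASE)


@dataclass
class Candidate:
    """A post that mentions at least one listed drug and looks personal."""
    text: str
    drugs: List[str]


def mentioned_drugs(text: str, drugs: List[str] = DRUGS) -> List[str]:
    """Return the listed drugs that appear as whole words in the text."""
    lowered = text.lower()
    return [d for d in drugs if re.search(rf"\b{re.escape(d)}\b", lowered)]


def is_personal_experience(text: str) -> bool:
    """Heuristic only: True if the text contains a first-person pronoun."""
    return FIRST_PERSON.search(text) is not None


def select_candidates(posts: List[str]) -> List[Candidate]:
    """Keep posts that mention a listed drug and pass the personal-experience check."""
    selected = []
    for post in posts:
        drugs = mentioned_drugs(post)
        if drugs and is_personal_experience(post):
            selected.append(Candidate(text=post, drugs=drugs))
    return selected


if __name__ == "__main__":
    sample = [
        "I took ibuprofen and my stomach hurts",
        "Metformin sales rose 5% this year",
        "My dog is cute",
    ]
    for candidate in select_candidates(sample):
        print(candidate.drugs, "|", candidate.text)
```

In a fuller pipeline, the posts selected here would then be passed to the effect-identification step and reviewed by the annotating healthcare professionals.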

Public Health Relevance

Adverse drug reactions are a leading cause of death in developed nations. Patient participation in reporting drug effects has been low, leading to under-reporting of many drug effects. This project will develop and test an innovative method to collect patient drug effect reports directly from experiences shared on Twitter, a general-purpose social media platform, in the hope of improving upon the traditional methods currently in practice.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Academic Research Enhancement Awards (AREA) (R15)
Study Section
Special Emphasis Panel (ZRG1-HDM-X (81))
Program Officer
Vanbiervliet, Alan
Purdue University
Schools of Arts and Sciences
West Lafayette
United States
Jiang, Keyuan; Chen, Tingyu; Calix, Ricardo A et al. (2018) Identifying Consumer Health Terms of Side Effects in Twitter Posts. Stud Health Technol Inform 251:273-276
Jiang, Keyuan; Feng, Shichao; Song, Qunhao et al. (2018) Identifying tweets of personal health experience through word embedding and LSTM neural network. BMC Bioinformatics 19:210
Jiang, Keyuan; Chen, Tingyu; Huang, Liyuan et al. (2018) A Data-Driven Method of Discovering Misspellings of Medication Names on Twitter. Stud Health Technol Inform 247:136-140
Jiang, Keyuan; Gupta, Ravish; Gupta, Matrika et al. (2017) Identifying personal health experience tweets with deep neural networks. Conf Proc IEEE Eng Med Biol Soc 2017:1174-1177
Calix, Ricardo A; Gupta, Ravish; Gupta, Matrika et al. (2017) Deep Gramulator: Improving Precision in the Classification of Personal Health-Experience Tweets with Deep Learning. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2017:1154-1159