Major depressive disorder is one of the most common debilitating illnesses in the United States, with a lifetime prevalence of 16.2%. Currently, nationwide mental health surveillance takes the form of large-scale telephone-based surveys. These surveys have high running costs and require teams of human telephone operators. Even the largest system, the Behavioral Risk Factor Surveillance System, reaches only 0.13% of the US population. Twitter (and other microblog services) offers a rich, if terse, multilingual source of real-time data for public health surveillance. Natural Language Processing (NLP) provides techniques and resources to unlock data from text. We propose using Twitter and NLP as a cost-effective and flexible approach to augmenting current telephone-based surveillance methods for population-level depression monitoring.
This grant application has two major strands: first, investigating the ethical issues and challenges to privacy that emerge with the use of Twitter data for public health surveillance (Aim One); second, developing techniques and resources for real-time public health surveillance for mental illness from Twitter (Aim Two and Aim Three).
Aim One seeks to investigate and codify our responsibilities as researchers towards Twitter users by engaging with those users directly.
With Aim Two, we will build and evaluate Natural Language Processing resources - algorithms, lexicons and taxonomies - to support the identification of depression symptoms in Twitter data.
For Aim Three, we will build and evaluate Natural Language Processing modules and services that use Twitter as a data source for monitoring depression levels in the community.
The significance of the proposed work lies in three areas. First, our investigations - both empirical and theoretical - will explore ethical issues in the use of Twitter for public health surveillance. This work has the potential to guide future research in the area. Second, in developing and evaluating algorithms and resources for identifying depression from tweets, we are contributing foundational work to the field of NLP. Third, these algorithms and resources will provide the bedrock for building social-media-based surveillance systems, which will offer a cost-effective means of augmenting current mental health surveillance practice. This proposal is innovative in its application area (microblogs have not been used before for mental health surveillance), in its focus on using NLP to identify depressive symptoms for public health, and in the central role that qualitative bioethical research will play in guiding the work.

Public Health Relevance

The proposed research focuses on using advanced Natural Language Processing methods to mine microblog data - in this case, Twitter - for mental health surveillance (specifically, depression surveillance), in order to augment current telephone-based mental health surveillance systems. The research has public health at its core.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Transition Award (R00)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Vanbiervliet, Alan
University of Utah
Schools of Medicine
Salt Lake City
United States
Park, Albert; Conway, Mike; Chen, Annie T (2018) Examining Thematic Similarity, Difference, and Membership in Three Online Mental Health Communities from Reddit: A Text Mining and Visualization Approach. Comput Human Behav 78:98-112
Park, Albert; Conway, Mike (2018) Harnessing Reddit to Understand the Written-Communication Challenges Experienced by Individuals With Mental Health Disorders: Analysis of Texts From Mental Health Communities. J Med Internet Res 20:e121
Mowery, Danielle; Smith, Hilary; Cheney, Tyler et al. (2017) Understanding Depressive Symptoms and Psychosocial Stressors on Twitter: A Corpus-Based Study. J Med Internet Res 19:e48
Park, Albert; Conway, Mike (2017) Longitudinal Changes in Psychological States in Online Health Community Members: Understanding the Long-Term Effects of Participating in an Online Depression Community. J Med Internet Res 19:e71
Park, Albert; Conway, Mike (2017) Tracking Health Related Discussions on Reddit for Public Health Applications. AMIA Annu Symp Proc 2017:1362-1371
Woo, Daniel; Debette, Stephanie; Anderson, Christopher (2017) 20th Workshop of the International Stroke Genetics Consortium, November 3-4, 2016, Milan, Italy: 2016.036 ISGC research priorities. Neurol Genet 3:S12-S18
Doan, Son; Ritchart, Amanda; Perry, Nicholas et al. (2017) How Do You #relax When You're #stressed? A Content Analysis and Infodemiology Study of Stress-Related Tweets. JMIR Public Health Surveill 3:e35
Conway, Mike; O'Connor, Daniel (2016) Social Media, Big Data, and Mental Health: Current Advances and Ethical Implications. Curr Opin Psychol 9:77-82
Mikal, Jude; Hurst, Samantha; Conway, Mike (2016) Ethical issues in using Twitter for population-level depression monitoring: a qualitative study. BMC Med Ethics 17:22
Alvaro, Nestor; Conway, Mike; Doan, Son et al. (2015) Crowdsourcing Twitter annotations to identify first-hand experiences of prescription drug use. J Biomed Inform 58:280-287

Showing the most recent 10 out of 12 publications