Surveillance and monitoring of risk behavior and disease is a top priority of many health-related agencies and organizations, including UNAIDS, the Centers for Disease Control and Prevention (CDC), and local public health and epidemiology departments. Identifying and localizing changes in risk behaviors and disease can dramatically improve public health outcomes and reduce health-related costs by showing where interventions are needed and how to direct public health efforts. For example, HIV researchers, public health departments, and government organizations have attempted to identify and monitor HIV risk behavior (e.g., sexual intercourse and illicit drug use) and HIV outbreaks to improve prevention and treatment efforts and curb a growing HIV epidemic. Social media use has been increasing rapidly, and data from these technologies might be leveraged to identify HIV risk behaviors, such as sexual and drug-related risk behaviors. Although researchers have developed methods of using social media to monitor HIV and public health behaviors and outcomes, these methods require extensive manual effort, technical expertise, and multiple software platforms to process such large volumes of data. Advances in technology infrastructure, data mining, and machine learning can be leveraged to create a single automated platform for extracting free-text social media conversations, labeling these conversations to identify health risk-related behaviors, and using these labels to monitor disease outbreaks. We propose to create such a platform: it will collect social media (Twitter) data; identify, code, and label tweets that suggest HIV risk behaviors; and provide output that HIV researchers, public health workers, and policymakers can use to monitor HIV risk behaviors and outcomes.
The tools developed from this application will be open source, tailored for use by epidemiologists and public health departments, and available for integration with other software tools to improve the effectiveness of public health monitoring systems.

Public Health Relevance

Surveillance and monitoring of health-related risk behavior is a top priority of many agencies and organizations. This project is particularly significant because it seeks to develop software to allow researchers and epidemiologists to analyze real-time free-text conversations from social media to monitor HIV and public health-related risk behaviors and disease outbreaks. Results can be used to provide additional observational and surveillance data and to improve future intervention delivery.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Research Project--Cooperative Agreements (U01)
Project #
5U01HG008488-02
Application #
9146666
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Sofia, Heidi J
Project Start
2015-09-18
Project End
2018-05-31
Budget Start
2016-06-01
Budget End
2017-05-31
Support Year
2
Fiscal Year
2016
Total Cost
Indirect Cost
Name
University of California Los Angeles
Department
Biostatistics & Other Math Sci
Type
Biomed Engr/Col Engr/Engr Sta
DUNS #
092530369
City
Los Angeles
State
CA
Country
United States
Zip Code
90095
Garett, Renee; Liu, Sam; Young, Sean D (2018) The Relationship Between Social Media Use and Sleep Quality among Undergraduate Students. Inf Commun Soc 21:163-173
Young, Sean D; Mercer, Neil; Weiss, Robert E et al. (2018) Using social media as a tool to predict syphilis. Prev Med 109:58-61
Liu, Sam; Young, Sean D (2018) A survey of social media data analysis for physical activity surveillance. J Forensic Leg Med 57:33-36
Young, Sean D; Torrone, Elizabeth A; Urata, John et al. (2018) Using Search Engine Data as a Tool to Predict Syphilis. Epidemiology 29:574-578
Goldfarb, Dennis; Lafferty, Michael J; Herring, Laura E et al. (2018) Approximating Isotope Distributions of Biomolecule Fragments. ACS Omega 3:11383-11391
Young, Sean D; Yu, Wenchao; Wang, Wei (2017) Toward Automating HIV Identification: Machine Learning for Rapid Identification of HIV-Related Social Media Data. J Acquir Immune Defic Syndr 74 Suppl 2:S128-S131
Yu, Wenchao; Aggarwal, Charu C; Wang, Wei (2017) Temporally Factorized Network Modeling for Evolutionary Network Analysis. Proc Int Conf Web Search Data Min 2017:455-464
Liu, Sam; Zhu, Miaoqi; Yu, Dong Jin et al. (2017) Using Real-Time Social Media Technologies to Monitor Levels of Perceived Stress and Emotional State in College Students: A Web-Based Questionnaire Study. JMIR Ment Health 4:e2
Carey, Michael J; Jacobs, Steven; Tsotras, Vassilis J (2016) Breaking BAD: A Data Serving Vision for Big Active Data. Proc Int Workshop Distrib Event Based Syst 2016:181-186
Cheng, Wei; Guo, Zhishan; Zhang, Xiang et al. (2016) CGC: A Flexible and Robust Approach to Integrating Co-Regularized Multi-Domain Graph for Clustering. ACM Trans Knowl Discov Data 10:
