U.S. health officials are struggling to manage health conditions and disease outbreaks (e.g., flu) affecting underserved communities. Yet early warnings can be found in public posts made by members of these communities on social apps like Twitter. Recent studies from the Pew Research Center indicate that minorities are as likely to own a mobile phone as non-minorities, and are avid users of social apps. Since public Twitter posts can be searched and accessed without having a friendship relation with the author, this platform could provide health officials with the analytics capability to track which diseases are being discussed in a given region at a specific time. Unfortunately, the software tools to perform these analytic tasks are still in the early stages of development. Very often, the expertise to use the required big data software and machine learning programs is not readily available, limiting access for officials working with underserved communities. In this project, we seek to conduct basic research aimed at designing, implementing, and testing an open-source research prototype of an integrated and scalable platform that searches Twitter posts and analyzes their contents for clues about health conditions, in order to understand the health issues affecting underserved communities and to predict health conditions that might affect them in the future.
In Aim 1, we will build an automated Twitter data warehouse to collect, index, and query public posts.
In Aim 2, we will build a predictive analytics engine that uses social data to make predictions about possible outbreaks of conditions, the regions that might be affected, and at-risk groups. Finally, in Aim 3, we will build mobile and web apps, with a map-based interface, to query and visualize the health data. The value-added capability of our system is that it works as an integrated whole to analyze tweets, visualize data along disease and spatio-temporal attributes, and perform predictive analytics, all under one roof. This could have a significant impact on public health disease tracking and response. The University of Puerto Rico, Mayagüez (UPRM) is a Hispanic-serving institution, with the second largest Hispanic-serving engineering school in the U.S. and 35% female enrollment. This AREA project provides a unique opportunity to train students in social media analysis, big data systems, machine learning, and predictive analytics.
U.S. health officials are struggling to manage health conditions and disease outbreaks (e.g., flu) affecting underserved communities. Yet early warnings can be found in public posts made by members of these communities on social apps like Twitter. In this project, we will build an open-source system that searches public Twitter posts and analyzes their contents for clues about health conditions, in order to understand the health issues affecting underserved communities and to predict health conditions that might affect them in the future.
Rodríguez-Martínez, Manuel (2017) Experiences with the Twitter Health Surveillance (THS) System. Proc IEEE Int Congr Big Data 2017:376-383