High-quality, real-time data are essential in public health crises. Yet traditional survey methods that rely on random-digit dialing are expensive, difficult to deploy instantly, and fail to sample hard-to-reach populations without landline telephones, such as young adults (18-30) and minorities. In contrast, these groups use social media heavily. Twitter, in particular, is widely available and immediate, providing a rich data source that can be used to pilot hypotheses at minimal cost. These hypotheses can then be refined prior to a more in-depth study. However, social media data pose challenges for public health officials and researchers who aim to test new hypotheses and policies. These challenges stem from the size of the dataset and the difficulty of filtering and validating the data. We will therefore develop and test an innovative computational tool that overcomes these challenges. This tool will supplement traditional survey techniques by facilitating real-time data gathering and rigorous quantitative analysis of social media data related to health narratives, attitudes, and behaviors. We will validate our tool by comparing existing survey data to social media data about influenza vaccination among adults 18-30, adult African Americans, and non-White Hispanics of all ages. These three demographic categories have the highest rates of social media use, lower rates of participation in survey research, and the lowest rates of seasonal flu vaccination. The tool will thereby enable theory building: we will test hypotheses derived from the health communication literature, especially regarding how group attitudes form and change; categorize attitudes and collective narratives according to existing theories and conceptual models; and build new theory to capture emerging and previously unidentified concepts. Finally, we will disseminate our results and novel techniques through a website, vaccinetrends.org, that provides processed social media data to the research community.
Our approach offers inexpensive, immediate access to the attitudes of these groups, transcending traditional constraints of time, money, and data access. It is novel because it combines the strengths of social media analysis with those of validated survey techniques. We will draw upon two complementary population samples, representing different timescales and demographics, in order to test hypotheses in a manner that is rapid yet rigorous. Our social media analysis will also draw upon novel techniques to infer demographic information and social group membership, enabling the extraction of master narratives, that is, attitudes and content associated with rationales for vaccine refusal and, ultimately, with behavior. In addition, we will develop tools and techniques that can be adopted by researchers throughout the social, computer, and health sciences. Finally, we will draw upon a much more extensive data source than has been used in previous work, including billions of Twitter messages and public forum information, which will enable in-depth automated content analysis of vaccine refusal rationales.
This project addresses challenges that social media data pose for public health officials and researchers who aim to test new hypotheses and policies. These challenges stem from the size of the dataset and the difficulty of filtering and validating the data. The proposed studies will develop, test, and validate an innovative computational tool that will supplement traditional survey techniques by facilitating real-time data gathering and rigorous quantitative analysis of social media data related to health narratives, attitudes, and behaviors.