Social media sites such as Twitter, Facebook, YouTube, and Flickr host an ever-increasing amount of user content captured or produced in association with real-world events, from presidential inaugurations to community-specific events. Unfortunately, the existing tools to find, organize, and present the social media content associated with events are extremely limited. This project will address critical end-to-end information processing and presentation methods that will transform public access to real-world event information from social media sources. In particular, this work will increase the digital presence of currently underrepresented communities and address their information needs: for these communities, events are often not covered by mainstream media, but are increasingly available on social media services. As a distinctive characteristic, the project will draw on several research areas, namely, information retrieval and databases, human-computer interaction, and social media, thus contributing to educating multidisciplinary students. The PIs will continue to include undergraduate students and students from underrepresented populations in the research.

The project will result in new data analysis and visualization techniques for event-based information tasks, addressing human and computational factors in social media systems to handle vast collections of noisy, user-contributed content of widely varying structure and quality. To enable effective browsing, search, and presentation of event content, this work will use the wealth of social media documents to address several fundamental problems. The first problem is the detection of events in repositories of social media content. Such content, increasingly posted by users in real time, is noisy and highly heterogeneous, but can help in the early detection of a wide range of events of all sizes. The second problem is the comprehensive identification of content related to detected or known events, currently fragmented across social media sites and often hard to find and collect. The third problem is content presentation, which requires the development of novel presentation and visualization techniques for social media event content. The amount of content available even for a single event can be overwhelming and hinder data exploration and sense-making.

The project will create new tools that will transform how users view event information. These tools will allow users to create and share personalized views of the event data as a story-telling practice. One main outcome is that the data used in the research will be made available to other researchers whenever possible. Another main outcome will be a publicly available prototype system based on this research, designed to help connect computing and information science challenges to the activities and natural interests of a diverse set of users.

Project Report

Increasingly, the bulk of information from local and global events is being contributed by individuals through social media channels: on social networking sites (e.g., Facebook, Twitter), on photo and video-sharing sites (e.g., Instagram, YouTube), and elsewhere. These events range from major global events such as the Syrian uprising or the earthquake in Haiti, to local events like protests or music concerts, to televised events such as a Presidential speech. The research in this project focused on three main threads related to events and social media: the detection of events based on social media content, the comprehensive identification of content related to events (currently fragmented across social media sites), and presentation and visualization techniques for event content that can otherwise be overwhelming in volume. The work both developed new methods and algorithms for events in social media and created applications to use and access that data, following a user-centered design approach. Ultimately, this work enables multiple stakeholders, such as journalists, first responders, researchers, policy makers, and the public, to see and understand what happens in world events using social media.

Vox Civitas is one of the tools we developed as part of this project, in 2010. Vox is a visual analytic tool designed to help journalists and media professionals extract news value from large-scale aggregations of social media content around broadcast events. Vox was a pioneering tool that aided the sensemaking and source-finding process for those who turn to real-time event information from Twitter.

Following Vox, the SRSR project developed more tools to assist the use of social media in news reporting during breaking events. Social media offers an opportunity for journalists to reach beyond their typical source networks of elite or otherwise affiliated sources. We created one of the first computational tools to address the opportunity social media offers for journalists to find and assess information sources. SRSR (Seriously Rapid Source Review) introduced a number of novel features, machine learning techniques, and visual analytics to help journalists identify and evaluate a variety of sources in social media.

CityBeat is another tool created in this project that used social media to help journalists find and use information, in this case by helping them identify hyper-local events in a given geographic area (New York City). The main objective of CityBeat is to provide users, with a specific focus on journalists, with information about what is going on in the city, and to alert them to unusual activities. The system collects a stream of geo-tagged photos as input, and uses time series analysis and machine learning techniques to detect hyper-local events. The system also computes trends and statistics about the data for the real-time display. The system, designed as a large-screen ambient visualization, was made publicly available, and was in use at several top newsrooms in New York City.

Beyond the implemented systems, this project contributed many new algorithms and methods for handling social media data for events. Significant, early contributions were made to the research area of detecting events in social media, to identifying the related content for each event, and to clustering and organizing the information in each event. This project started before the Arab Spring events had demonstrated the power of Twitter and other social media as a prominent source of information about the world's events. Within a few short years, the ideas and methods developed in this work became important not only academically, but also in real-world settings where multiple organizations monitor and make sense of social media data for events.
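
As a rough illustration of the kind of time series analysis CityBeat is described as applying to geo-tagged photo streams, the sketch below flags time bins whose photo count is far above the recent baseline for one geographic cell. This is not the project's actual algorithm: the function name `detect_bursts`, the sliding-window z-score rule, and all parameter values are assumptions chosen for illustration only.

```python
from statistics import mean, pstdev


def detect_bursts(counts, window=6, threshold=3.0, min_std=1.0):
    """Return indices of time bins whose count is unusually high.

    counts    -- photo counts per fixed time bin for one geographic cell
    window    -- number of preceding bins used as the baseline (assumed value)
    threshold -- z-score at or above which a bin is flagged (assumed value)
    min_std   -- floor on the spread, so a flat history does not make
                 every small fluctuation look like an event
    """
    bursts = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        baseline = mean(history)
        spread = max(pstdev(history), min_std)
        z_score = (counts[i] - baseline) / spread
        if z_score >= threshold:
            bursts.append(i)
    return bursts


# Hypothetical hourly photo counts for one city block.
hourly_counts = [4, 5, 3, 6, 4, 5, 4, 21, 5, 4]
print(detect_bursts(hourly_counts))  # [7]
```

A production system would likely account for daily and weekly seasonality and combine such a signal with learned classifiers over photo content, as the report's mention of machine learning suggests; the fixed-window baseline here is only the simplest possible stand-in.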

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1444493
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2014
Total Cost: $113,512
Indirect Cost:

Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850