Often referred to as microblogging, the practice of average citizens reporting on activities "on the ground" during a disaster is increasingly common. The contents of these messages are potentially valuable to responder organizations and victims, but their volume makes it difficult to separate valuable messages from the rest of the stream. This project will examine microblogged messages sent during disasters to determine which aspects of the messages, individually and collectively, indicate that they are relevant, verifiable and actionable. Factors to be considered include the content of the messages, the identity of the sender, and the overall pattern and spread of messages. The identified factors will then be used to instruct crowdsourced workers, who will label messages to create a large corpus of labeled messages.
The project matters because microblogging data are seen as increasingly important: they are ubiquitous, rapid and accessible, and they are believed to empower average citizens to become more situationally aware during disasters and to coordinate to help themselves. If successful, the project will yield evidence that relevant, verifiable and actionable messages can be identified within a stream of microblogged messages, together with the factors that indicate those qualities. A further outcome will be a labeled dataset of disaster-related messages, which will be useful to researchers, e.g., those seeking to automatically classify information within a microblogged data stream.