Cyber-violence is increasing exponentially as social networking applications such as Instant Messaging, Facebook, and MySpace are developed, deployed, and reach increasingly younger users. These young users frequently fall victim to cyber-predators and cyber-bullies. To address this ongoing concern, this research project will study the communicative strategies employed by both aggressors and victims in cases of online predation and cyber-bullying. The primary outcome of this project is the development of theoretical communicative models and technology for the detection of online predation and cyber-bullying. In addition to flagging aggressive communication and notifying parents, the open-source software developed with these funds will suggest appropriate responses so that a teen or tween can immediately defend himself or herself against an aggressor. The response software will also subtly teach effective defensive communication strategies that children can carry into other situations, whether online or in the real world. Existing data resources for research in this area are scant and problematic. The data to be collected and disseminated as part of this project can be used for future research in the development of theoretical communicative models of online predation and response, and of cyber-bullying communication and response. A data set for research in resolving multiple Internet identities will also be created and distributed.
This research project will bring together two fields, Computer Science and Media and Communication Studies, which have been dramatically impacted by the explosive growth of Internet social networking sites. The study will build upon existing web-centered technologies and tools, as well as theories of communication that were developed by close analysis of other media forms, such as television and print media. Existing machine learning algorithms will be enhanced by the development and integration of communicative theories that have been updated for online interactions. The focus on cyber-violence, especially cyber-violence directed at children, supplies a socially relevant test-bed that has suffered from neglect by researchers in Computer Science, primarily due to the lack of standard data sets. This project will provide collections of annotated data that will be used by other researchers in both fields.
Cyber-violence among youths is increasing exponentially as social networking applications such as Facebook and Twitter reach increasingly younger users on a growing number of platforms such as iPads and smartphones. These young users frequently fall victim to cyber-predators and cyber-bullies, sometimes with tragic results. Our grant, "Tracking Predators and Bullies Via Chat Log Transcripts," sought to provide solutions for protecting children against cyber-bullying and cyber-predation. Using communicative theory, content analysis methodologies, and information retrieval technologies to analyze chat logs and posts from social media sites, we were able to accomplish the goals of this project while making our raw and labeled data available for other scholars to use. Our project began with an exploration of the face-to-face luring theory in an online environment, the Perverted Justice (PJ) website, where volunteers pose as youths in chatrooms and adults solicit them for sex. To test the luring theory, we developed a codebook for manual coding of sexual predators’ communicative strategies in 500+ PJ chatlogs. In the course of this analysis, unfortunately, we learned that the luring theory neither explains nor predicts online predatory communications, nor does it explain why youths are willing to reveal personal information that predators exploit, an avenue we are now vigorously pursuing. In tandem with our manual coding, we developed a software application, ChatCoder, that uses machine learning techniques to identify predatory authors and posts in an online conversation. With ChatCoder we were able to detect up to 87% of the predatory authors in an unseen test set. Our project also included close analysis of the language used in cyber-bullying. We identified the most commonly used cyber-bullying words, as well as search terms that can be used to reliably detect cyber-bullying on the social media site Formspring.me.
We also developed a new search technique that assigns scores to posts from social networking sites. The posts with the largest scores were shown to have a high density of cyber-bullying content. During the course of our grant, we presented our findings at nine conferences in the U.S. and abroad, and our work has been published in eleven articles. We also created a website, ChatCoder.com, both to provide technical support and data for the project and to keep interested readers informed about our progress.
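The idea of assigning scores to posts and inspecting the highest-scoring ones can be illustrated with a minimal sketch. This is not the project's actual search technique, whose scoring function is not described here. It assumes a simple density score (the fraction of a post's words that appear in a term list), and the term list FLAGGED_TERMS and the function names score_post and rank_posts are hypothetical placeholders chosen for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical placeholder lexicon (assumption); the project's actual
# cyber-bullying term list is not reproduced here.
FLAGGED_TERMS = frozenset({"loser", "stupid", "ugly"})

# Lowercase alphabetic tokens, allowing apostrophes inside words.
TOKEN_RE = re.compile(r"[a-z']+")


def score_post(text: str, lexicon: Iterable[str] = FLAGGED_TERMS) -> float:
    """Return the fraction of tokens in `text` that appear in `lexicon`.

    This density score is an assumed stand-in for the scoring function,
    not the one used in the project.
    """
    terms = set(lexicon)
    tokens = TOKEN_RE.findall(text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in terms)
    return hits / len(tokens)


def rank_posts(
    posts: Iterable[str], lexicon: Iterable[str] = FLAGGED_TERMS
) -> List[Tuple[float, str]]:
    """Return (score, post) pairs sorted from highest to lowest score."""
    terms = set(lexicon)
    scored = [(score_post(post, terms), post) for post in posts]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    sample = ["have a nice day", "you are a stupid loser", "stupid"]
    for score, post in rank_posts(sample):
        print(f"{score:.2f}  {post}")
```

In a sketch like this, a reviewer would read the ranked list from the top, since the posts with the largest scores are the ones most densely populated with flagged terms. A real system would need a validated term list and human review of the flagged posts.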