The Internet has evolved into an essential medium that permeates most aspects of our lives. We consume information and entertainment online, cultivate relationships, exchange ideas, and handle business transactions. Not surprisingly, this new medium has also attracted malicious elements who seek to use the Internet to take advantage of others. Information manipulation is an emerging frontier in cyber security. It denotes all attempts by adversaries to distort information with the goal of influencing opinion, thought, or action. It can take many shapes and forms, from blatant attacks, such as search poisoning, to misinformation, such as bogus online reviews, and more subtle distortion, such as personalized search and biased news. Unlike more traditional attacks, which typically aim to take control of computational resources or sensitive data, information manipulation targets human minds and their ideas. Left unchecked, information manipulation can harm our economy, culture, and democracy.
In this research project, the PIs aim to systematically study the ways in which attackers can manipulate information along its flow from the source where it is created to the recipient. Of particular interest are systems that help to discover, organize, and present information to users. These systems, such as search engines and news portals, reach large audiences and act as filters that often determine which content users see. Thus, attackers can achieve significant leverage when they successfully manipulate the filter mechanisms to their benefit. As one example, attackers can carry out search engine poisoning attacks to trick search engines into ranking their content higher than its organic value warrants. However, attackers do not need to target search engines directly; it is also possible to manipulate ranking by targeting the users of search engines and their search history. Based on the study and analysis of attacks, the PIs will develop general detection approaches to identify when systems are under attack. This information can then be leveraged to design appropriate countermeasures.
In this research, we explored information manipulation attacks along two main vectors. First, we looked into the problem of fake reviews on popular websites. Sites such as Yelp and TripAdvisor allow anonymous Internet users to create accounts and rate their experience with products and services. Clearly, businesses have an incentive to make their products appear better than they are (or to slander competitors). In our project, we analyzed review entries and used anomaly detection to determine whether certain entries are likely fraudulent. To this end, we used a number of techniques that leverage spatial and temporal correlation. For spatial correlation, we examined differences between multiple reviewing sites for a specific business: when reviews on one site deviate substantially from those on other sites, they are possibly fraudulent. For temporal correlation, we checked whether the reviews for a specific business change suddenly over time: a sudden shift combined with a substantial increase in the number of reviews for that business is suspicious. Finally, we also considered the accounts of the reviewers. For example, reviewers who do not live in the area or who have reviewed only a small number of businesses are considered less trustworthy.
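The spatial and temporal checks described above can be illustrated with a minimal Python sketch. It is not the project's actual detector: the function names, the input layout (ratings grouped by site and by week), and the thresholds `min_shift` and `min_volume_ratio` are all assumptions chosen for illustration.

```python
from statistics import mean


def spatial_deviation(ratings_by_site, site):
    """Mean rating on `site` minus the average of the other sites' mean ratings.

    `ratings_by_site` maps a (hypothetical) site name to the list of star
    ratings one business received there. A large absolute value suggests
    that reviews on `site` deviate from the consensus elsewhere.
    """
    own = ratings_by_site.get(site)
    others = [mean(r) for s, r in ratings_by_site.items() if s != site and r]
    if not own or not others:
        return 0.0  # not enough data to compare
    return mean(own) - mean(others)


def temporal_bursts(weekly_ratings, min_shift=1.0, min_volume_ratio=2.0):
    """Indices of weeks whose ratings shift sharply while review volume jumps.

    `weekly_ratings` is a list of per-week rating lists for one business.
    Both thresholds are illustrative assumptions, not values from the study.
    """
    flagged = []
    for i in range(1, len(weekly_ratings)):
        history = [r for week in weekly_ratings[:i] for r in week]
        current = weekly_ratings[i]
        if not history or not current:
            continue
        avg_volume = len(history) / i  # average reviews per earlier week
        shift = abs(mean(current) - mean(history))
        if shift >= min_shift and len(current) >= min_volume_ratio * avg_volume:
            flagged.append(i)
    return flagged
```

A business would be a candidate for closer inspection when either check fires; a fuller system would also weigh reviewer account features such as location and review history, which this sketch omits.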
Second, we worked on a study that identified ways in which websites can implement web fingerprinting, a term for techniques that web servers can use to track clients (web browsers). Tracking is used for many different reasons, including targeted advertising and fraud prevention. The most prominent fingerprinting mechanism is the browser cookie. However, given their negative publicity and initiatives such as Do Not Track, cookies are becoming increasingly unreliable. Hence, companies (and attackers) search for alternatives. Our study led to the discovery of various novel approaches to carry out fingerprinting, and it also revealed the extent to which entities on the web make use of these (aggressive) techniques. As part of our research, we discovered novel browser-fingerprinting techniques that can, in milliseconds, uncover a browser's family and version. Finally, we demonstrated that over 800,000 users who currently use user-agent-spoofing extensions are more fingerprintable than users who do not attempt to hide their browser's identity; this result challenges the advice given by prior research to use such extensions as a way of increasing one's privacy. Our research uncovered important information manipulation attacks and potential privacy invasions in two different domains. This project was a one-year, initial exploration of the domain. We believe that the problems we set out to investigate will yield additional fruitful research problems (and results) in the years to come.
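To illustrate why user-agent spoofing can make a browser easier to fingerprint, the following Python sketch shows one plausible server-side consistency check. It is an assumption-laden example, not the technique from our study: the function names are hypothetical, and the `EXPECTED_VENDOR` table is an illustrative mapping from browser family to the `navigator.vendor` value a page script might report.

```python
# Illustrative assumption: the navigator.vendor value typically reported by
# each browser family. Real fingerprinting scripts probe many more features.
EXPECTED_VENDOR = {
    "chrome": "Google Inc.",
    "firefox": "",
    "safari": "Apple Computer, Inc.",
}


def claimed_family(user_agent):
    """Guess the browser family that a User-Agent string claims to be."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:  # checked before Safari: Chrome UAs also contain "Safari/"
        return "chrome"
    if "safari/" in ua:
        return "safari"
    return None


def looks_spoofed(user_agent, observed_vendor):
    """True when the claimed family disagrees with the observed vendor string."""
    family = claimed_family(user_agent)
    if family is None:
        return False  # unknown family: no basis for comparison
    return EXPECTED_VENDOR[family] != observed_vendor
```

A mismatch is itself a distinguishing signal: a client whose header claims one family while its script-visible properties indicate another stands out from the large population of consistent browsers.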