This Small Business Innovation Research (SBIR) Phase I project will develop software to prevent the manipulation of consumer reviews of websites and online businesses. Consumer reviews are a vital component in educating consumers about the trustworthiness of websites. However, any platform that makes it easy for consumers to review websites also makes itself vulnerable to abuse from actors who write purposefully deceptive, self-promoting, or low-quality reviews. These manipulative reviews can mislead consumers and permanently damage the credibility of the platform on which the reviews are published. Preventing these reviews is difficult because although traditional spam filters are effective in filtering automated spam, they are unable to detect manipulative reviews written by humans. This project will assess the feasibility of building and training a customized content filter with additional heuristic algorithms incorporating community feedback, reviewer attributes, and supplemental third-party data, to effectively detect and remove both automated and human-generated manipulative reviews.

The FBI received 275,000 complaints of online fraud in 2008. The Washington Post has estimated $100 billion is lost every year to online fraud. Sites that often present the biggest risk to consumers include health information providers, paid online service providers, small retailers, and sites based outside the US. To address this problem, the company will help consumers identify the best and worst websites quickly and easily via reviews written by members of the community. A typical use case might involve a consumer who is looking to make a purchase on an obscure and unfamiliar website. Using the solution, instead of taking a risk, the consumer could look up the website in question and benefit from the experiences of other consumers to learn important information such as: whether the website is involved in any known scams, whether the depictions of goods or services are consistent with what is delivered, and whether there is a better website that provides similar goods or services. If successfully deployed, the solution described in this research effort will address a significant and growing problem related to e-commerce.

Project Report

Most Americans lack a reliable resource to evaluate unfamiliar websites and online businesses. Most use a familiar site like Amazon or take a gamble on whatever comes up in a Google search. These gambles frequently result in fraud. There are now over 250 million websites and the Washington Post has estimated $100 billion is lost every year to online fraud. While these statistics cover a range of sites and types of online fraud, there is a particularly high risk of fraud around small online retailers, online pharmacies, online services, overseas counterfeiters, and work-at-home schemes. SiteJabber.com flags fraudulent websites using consumer reviews and uses that information to warn other consumers so they are not defrauded. However, owners of fraudulent websites attempt to circumvent this service by creating clones of their websites (a tactic fraudsters also use to avoid law enforcement). This Phase I was designed to create software that can leverage consumer reviews of fraudulent sites to proactively detect clones of fraudulent sites, thereby expanding SiteJabber's capacity to prevent online fraud. The difficulty in performing this project lay in collecting and analyzing enough data on known clones, creating an algorithm able to accurately predict which sites are clones without implicating innocent sites, and then developing a method to fetch sites that could be potential clones. While we cannot publicly disclose the exact methods involved, so that fraudsters cannot game the software, the Phase I study fulfilled its objective to build software to automate the detection of clones of fraudulent or otherwise harmful websites to protect consumers. The software was determined to be 96.6% accurate (+/- 4% at a 95% confidence level); it uncovered 2,173 new fraudulent sites and flagged another 1,243 for further review. This information can now be deployed to warn the half a million consumers who visit SiteJabber.com every month.
Moreover, as the SiteJabber community reports more fraudulent websites, hundreds and perhaps thousands of new clones will also be found using the Phase I software.
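
As a rough illustration of how an accuracy figure such as "96.6% +/- 4% at a 95% confidence level" can be read, the sketch below applies the standard normal-approximation margin of error for a proportion. The report does not state how many sites were manually checked or which interval method was used, so the function names, the choice of formula, and the implied sample size are all assumptions made for illustration only.

```python
import math

# Assumption: the reported margin comes from a normal-approximation (Wald)
# interval for a proportion. The report does not say which method was used.
Z_95 = 1.96  # standard normal critical value for a 95% confidence level


def margin_of_error(p_hat, n, z=Z_95):
    """Half-width of the Wald interval for a proportion p_hat from n samples."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def sample_size_for_margin(p_hat, margin, z=Z_95):
    """Smallest n whose Wald margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p_hat * (1.0 - p_hat) / margin ** 2)


p_hat = 0.966   # accuracy stated in the report
margin = 0.04   # margin stated in the report

# Hypothetical: the number of hand-verified sites that would yield this margin.
n = sample_size_for_margin(p_hat, margin)

# Interval implied by the reported figures, capped at 100%.
low = p_hat - margin
high = min(1.0, p_hat + margin)

print(f"implied sample size under these assumptions: {n}")
print(f"accuracy interval: {low:.3f} to {high:.3f}")
```

Under these assumptions, a verification sample of only a few dozen sites would be enough to produce a margin of this size. The normal approximation is known to be loose when the proportion is close to 100%, so a Wilson or exact interval would be a more careful choice if the raw counts were available.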

Budget Start: 2010-07-01
Budget End: 2010-12-31
Fiscal Year: 2010
Total Cost: $149,600
Name: Ggl Projects, Inc.
City: San Francisco
State: CA
Country: United States
Zip Code: 94110