Most websites use JavaScript to provide personalized content to users. At the same time, more and more attackers are using the web to deliver their attacks, especially through malicious JavaScript. Malicious JavaScript detection needs to be fast enough that it does not interfere with users' normal activities (non-invasive), yet effective enough to protect them from the majority of attacks. Rule-based or signature-based detection mechanisms often fail to detect obfuscated malicious JavaScript. Behavior-based detection mechanisms are more robust against obfuscation and have been effective in identifying variants of known attacks. However, monitoring behavior during execution is usually rather invasive, as it requires too much time and too many resources to be used in web browsers while users are interacting with websites.

This project investigates non-invasive detection of malicious JavaScript using classifiers (data mining techniques) trained on malicious scripts, including obfuscated scripts. Preliminary results show that it is possible to detect the vast majority of malicious scripts without full-blown de-obfuscation, while labeling very few benign scripts as malicious. Because the detection mechanism correctly identifies most benign scripts, resource-intensive detection mechanisms can use this method to filter out most benign scripts and focus only on the remainder.
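As a rough illustration of the idea above, the following Python sketch extracts a few lexical features from a script's source text, without de-obfuscating or executing it, and labels the script with a nearest-centroid classifier. The specific features (character entropy, counts of calls such as eval and unescape, counts of escape sequences), the toy training scripts, and all function names are assumptions made for this example; the project description does not say which features or classifiers were actually used.

```python
import math
import re
from collections import Counter

# Hypothetical feature definitions, chosen only for illustration.
SUSPICIOUS_CALLS = re.compile(r"\b(?:eval|unescape|fromCharCode|document\.write)\b")
ESCAPES = re.compile(r"%u[0-9a-fA-F]{4}|%[0-9a-fA-F]{2}|\\x[0-9a-fA-F]{2}")


def shannon_entropy(text):
    """Character-level Shannon entropy in bits; 0.0 for empty text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def extract_features(script):
    """Map raw script text to a small numeric feature vector."""
    return [
        shannon_entropy(script),
        float(len(SUSPICIOUS_CALLS.findall(script))),
        float(len(ESCAPES.findall(script))),
    ]


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(labelled_scripts):
    """Build one centroid per label from (script, label) pairs."""
    by_label = {}
    for script, label in labelled_scripts:
        by_label.setdefault(label, []).append(extract_features(script))
    return {label: centroid(rows) for label, rows in by_label.items()}


def classify(model, script):
    """Return the label whose centroid is closest in Euclidean distance."""
    features = extract_features(script)
    return min(model, key=lambda label: math.dist(features, model[label]))


# Toy training data, invented for this sketch.
TRAINING = [
    ("function add(a, b) { return a + b; }", "benign"),
    ("var total = items.length; console.log(total);", "benign"),
    ("eval(unescape('%u9090%u9090%u4141%u4141'))", "malicious"),
    ("document.write(unescape('%3C%69%66%72%61%6D%65%3E'))", "malicious"),
]

if __name__ == "__main__":
    model = train(TRAINING)
    print(classify(model, "eval(unescape('%u4242%u4242%u4242'))"))
    print(classify(model, "var x = document.getElementById('menu');"))
```

In a filtering setup like the one described, scripts that such a cheap classifier labels benign could be skipped, leaving the more resource-intensive analysis for the remainder. A realistic classifier would need far more training data and carefully validated features than this toy example.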

Key elements of the envisioned solution are: (a) automatic collection of malicious JavaScript; (b) a partial de-obfuscator that extracts features for classifiers; (c) classifiers that assess the maliciousness of scripts; (d) redirection graphs that chronicle the connections between websites hosting known malicious scripts; and (e) a feedback mechanism to assist JavaScript collection and classifier re-training.
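To make element (d) more concrete, here is a minimal Python sketch of a redirection graph: a directed graph whose edges record observed redirects between hosts, with a breadth-first search that reports whether a host leads to one already known to serve malicious scripts. The class name, method names, the hop limit, and the example hosts are all hypothetical; the project description does not say how its redirection graphs were represented or queried.

```python
from collections import defaultdict, deque


class RedirectionGraph:
    """Directed graph of observed redirects between hosts (illustrative only)."""

    def __init__(self):
        self._edges = defaultdict(set)   # source host -> set of target hosts
        self._malicious = set()          # hosts known to serve malicious scripts

    def add_redirect(self, source, target):
        """Record that `source` was observed redirecting to `target`."""
        self._edges[source].add(target)

    def mark_malicious(self, host):
        """Flag a host as known to serve malicious scripts."""
        self._malicious.add(host)

    def reaches_malicious(self, start, max_hops=5):
        """Return the shortest redirect chain from `start` to a known
        malicious host within `max_hops` redirects, or None if there is none."""
        queue = deque([(start, [start])])
        seen = {start}
        while queue:
            host, path = queue.popleft()
            if host in self._malicious:
                return path
            if len(path) - 1 >= max_hops:
                continue  # hop budget exhausted on this chain
            for nxt in sorted(self._edges.get(host, ())):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [nxt]))
        return None


if __name__ == "__main__":
    graph = RedirectionGraph()
    graph.add_redirect("a.example", "b.example")
    graph.add_redirect("b.example", "evil.example")
    graph.mark_malicious("evil.example")
    print(graph.reaches_malicious("a.example"))
```

A chain found this way could serve as one input to the feedback mechanism in element (e), for example by prioritizing hosts along the chain for script collection; that use is an assumption of this sketch rather than something stated in the description.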

Project Report

Today, more than ever, there is a critical need for a secure web-browsing environment. Internet users rely on the web more and more to find information, to socialize, to conduct business, and in many other aspects of their lives. However, more attackers use the web to deliver their attacks, especially through malicious JavaScript. Therefore, it is critical to develop malicious JavaScript detection that is fast enough that it does not interfere with users' normal activities (that is, non-invasive), yet effective enough to protect them from the majority of attacks. Malicious JavaScript often uses obfuscation to hide known exploits and to prevent rule-based or regular expression (regex)-based anti-malware software from detecting the attack. The complexity of obfuscation techniques varies from using uncommon encodings to modifying control flow. For instance, attacks often include references to legitimate companies to disguise their purpose and include context-sensitive information in their obfuscation algorithm. This growing complexity raises the resources needed to de-obfuscate the attacks, potentially to the point that full de-obfuscation before detection becomes impractical. Behavior-based detection mechanisms are more robust against obfuscation and have been effective in identifying variants of known attacks. However, monitoring behavior during execution usually requires too much time and too many resources to be used in web browsers while users are interacting with websites. We investigated non-invasive detection of malicious JavaScript using classifiers trained on features present in malicious scripts. We designed and developed a comprehensive framework to solve the problem of non-invasive malicious JavaScript detection in web browsers.
Key components of our framework are: (a) classifiers that assess the maliciousness of scripts; (b) a conversational user interface that helps users decide when to interact with suspicious JavaScript; and (c) a user study protocol that showed the effectiveness of the user interface regardless of the computer fluency of the participants. The impact of this project is three-fold: 1) Without using de-obfuscation techniques or modifying the JavaScript engine, the detection rate can be higher than 90% per script. In other words, this project's outcome can be applied to any web browser on any platform, as long as there is Java support. Also, users can still interact with the website without running suspicious scripts, which improves usability significantly. 2) The user study shows that the conversational user interface is effective in discouraging users from running suspicious scripts, and that it also has an educational effect. Study participants reported that they were more likely to be cautious while surfing the web, knowing that malicious JavaScript is mostly invisible. 3) The user study with a unique population, poor and low in computer fluency, shows that the conversational user interface has similar effects with this population as well. We engaged service learning students in the user study and presented the experience at a service learning conference. In particular, these students were communications majors, so the user study became an interdisciplinary service learning research project, which is highly unusual. The poster explained how to identify potential projects for an interdisciplinary service learning course and was well received.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Application #
1063745
Program Officer
M. Mimi McClure
Project Start
Project End
Budget Start
2010-09-01
Budget End
2014-05-31
Support Year
Fiscal Year
2010
Total Cost
$142,050
Indirect Cost
Name
University of San Francisco
Department
Type
DUNS #
City
San Francisco
State
CA
Country
United States
Zip Code
94117