This project aims to address the growing threat of phishing attacks, messages that try to trick people into revealing sensitive information, by combining human and machine intelligence. Existing detection methods based on machine learning and blacklists are both brittle to new attacks and somewhat lenient, in order to avoid blocking legitimate messages; as a result, widely used email systems are vulnerable to carefully crafted phishing emails. To address this, the project team will develop systems that automatically block obvious scams while forwarding less certain cases to groups of crowd workers trained to detect phishing emails. To support these workers' decision-making, the team will develop novel explanations of the system's reasoning that highlight the aspects of both the message and the detection algorithm that triggered the need for human judgment. The system will also aggregate these crowd decisions to generate real-time phishing alerts that can be shared with both individual users and email systems. The project will lead to advances in interpretable machine learning, an important topic given the increasing role that artificial intelligence and machine learning systems play in society, and will also increase our ability to characterize the evolution of phishing attacks and the vulnerability of internet platforms and users to those attacks over time. The project team will also use the work as an important component of new courses on usable security and of outreach programs for high school teachers and students that both educate them about cybersecurity research and increase their participation in it.
The work is organized around three main objectives: empirically characterizing phishing risks, developing accurate and interpretable machine learning models for phishing detection, and developing reliable crowdsourcing systems for phishing alerts. The team will assess phishing risks by developing analytics tools that measure how effectively email systems adopt and configure anti-spoofing protocols, by using adversarial machine learning methods to conduct black-box testing of existing phishing detectors, and by creating reactive honeypots that entice and respond to phishing attacks in order to collect data not only on initial phishing emails but also on attackers' behavior throughout the course of a successful phishing attack. The data collected on phishing emails will be used to develop the machine learning models, using convolutional neural network (CNN) and long short-term memory (LSTM) based deep learning techniques to generate both suspicious features and confidence estimates for individual decisions. The suspicious features will be used to generate interpretable security cues, such as text annotations or icons, by first creating simpler and more interpretable machine learning models, such as decision trees, that mimic the local detection boundary near the target email in the feature space. Rules in the decision tree will be mapped back to interface elements and email content to provide the warnings, and these will be compared with generic email security warnings in a series of user studies that also model people's ability to detect phishing using a variety of cues, features, and media. Those individual models, along with the confidence estimates from the phishing detector, will then drive a crowdsourcing-based system in which workers' judgments, weighted by models of each worker's quality, are aggregated into reliable decisions about emails the detector judges too suspicious to pass unexamined but not suspicious enough to filter automatically.
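As a rough illustration of the local-surrogate idea described above, the sketch below perturbs a target email's feature vector, labels the perturbed neighbors with the black-box detector, fits a shallow decision tree to that neighborhood, and reads off the rules along the target's path. This is a minimal sketch under stated assumptions, not the project's implementation: the binary bag-of-features representation, the black_box_predict function, and all parameter values are illustrative placeholders.

    # Sketch: approximate a deep phishing detector's local decision boundary with a
    # shallow decision tree, then read off the rules that apply to one target email.
    # black_box_predict and feature_names are hypothetical placeholders.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def explain_locally(target_x, black_box_predict, feature_names,
                        n_samples=2000, flip_prob=0.1, max_depth=3, seed=0):
        """Fit a small surrogate tree to the detector's behavior near target_x
        (a binary feature vector) and return the rule path covering the target."""
        rng = np.random.default_rng(seed)
        # Perturb the target email's features to probe the nearby decision boundary.
        flips = rng.random((n_samples, target_x.size)) < flip_prob
        neighbors = np.where(flips, 1 - target_x, target_x)
        labels = black_box_predict(neighbors)            # 1 = phishing, 0 = legitimate
        # Weight samples by similarity to the target so the surrogate stays local.
        weights = np.exp(-flips.sum(axis=1))
        tree = DecisionTreeClassifier(max_depth=max_depth)
        tree.fit(neighbors, labels, sample_weight=weights)
        # Walk the fitted tree along the target's path and collect readable rules.
        t, node, rules = tree.tree_, 0, []
        while t.children_left[node] != t.children_right[node]:   # not a leaf yet
            feat, thr = t.feature[node], t.threshold[node]
            if target_x[feat] <= thr:
                rules.append(f"{feature_names[feat]} absent")
                node = t.children_left[node]
            else:
                rules.append(f"{feature_names[feat]} present")
                node = t.children_right[node]
        return rules

The returned rules (e.g., "sender domain mismatch present") are the kind of message-level cues that could then be mapped to annotations or icons in the email interface.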
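Similarly, the triage-and-aggregation step might look roughly like the following sketch, which routes an email by the detector's confidence and combines crowd votes weighted by each worker's estimated accuracy. The thresholds, the Vote structure, and the log-odds weighting scheme are assumptions for illustration, not design decisions taken from the project.

    # Sketch: route emails by detector confidence and aggregate crowd judgments
    # weighted by each worker's estimated reliability. All thresholds and the
    # weighting scheme here are illustrative assumptions.
    import math
    from dataclasses import dataclass

    @dataclass
    class Vote:
        worker_accuracy: float   # estimated reliability, e.g. from prior gold-standard tasks
        says_phishing: bool

    BLOCK_ABOVE = 0.95   # assumed threshold: confident enough to filter automatically
    PASS_BELOW = 0.05    # assumed threshold: confident enough to deliver unreviewed

    def triage(phishing_prob: float) -> str:
        """Decide whether an email is filtered, delivered, or sent for crowd review."""
        if phishing_prob >= BLOCK_ABOVE:
            return "block"
        if phishing_prob <= PASS_BELOW:
            return "deliver"
        return "crowd_review"

    def aggregate(votes: list[Vote]) -> bool:
        """Weighted majority vote: weight each worker by the log-odds of their
        estimated accuracy, a common heuristic for combining noisy judgments."""
        score = 0.0
        for v in votes:
            a = min(max(v.worker_accuracy, 0.51), 0.99)   # clamp to keep weights finite
            weight = math.log(a / (1 - a))
            score += weight if v.says_phishing else -weight
        return score > 0

    # Example: a borderline email (p = 0.6) is sent to three workers.
    if triage(0.6) == "crowd_review":
        is_phishing = aggregate([Vote(0.9, True), Vote(0.7, True), Vote(0.8, False)])
        # is_phishing is True here, so the system would raise a phishing alert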
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.