The NSF Convergence Accelerator supports use-inspired, team-based, multidisciplinary efforts that address challenges of national importance and will produce deliverables of value to society in the near future. Cyber attacks on enterprise networks pose a tremendous threat to business operations today. Defending against these attacks is time-consuming and labor-intensive because both the threat landscape and normal user traffic are constantly changing. To address this challenge, there is an ongoing effort across many sectors to adopt artificial intelligence and machine learning (AI/ML) models to automate security incident detection and response. In practice, however, there are two roadblocks to AI/ML-enabled workflows: (1) lack of sufficient data to train a reliable model to detect new attack campaigns or model normal behaviors; (2) lack of confidence in model outputs over a short timeframe, inducing undesirable tradeoffs between false positives (i.e., blocking legitimate users) and false negatives (i.e., missing attacks). Ideally, sharing data would help address both of these problems. However, this information is rarely shared (if at all) due to concerns about consumer or business privacy, and what is shared is in many cases anonymized in such a way that the data loses its value. This project will create new capabilities for sharing detailed yet privacy-preserving information about security incidents. These capabilities will substantially alter the data-sharing pipeline, both within and across organizations, and accelerate the industry transition to AI-driven security workflows. Better AI-driven cybersecurity tools will have an enormous impact in protecting critical infrastructure and networks across all sectors from cyber attacks.
This project will take an interdisciplinary approach spanning AI/ML, security, privacy, networked systems, law, and policy. It will tackle the fundamental tradeoffs among privacy, utility, and efficiency along three key thrusts: (1) Design and implement novel generative adversarial networks (GANs) by which an enterprise can model its network data to inform anomaly detection by others, and analyze the privacy implications of these GANs and their utility to others for detecting malicious network activity. (2) Design and implement new cryptographic protocols and systems workflows for efficiently comparing hypotheses (suspicious identifiers, such as domain names, IP subnets, and program hashes) across enterprises to inform policy deployments. (3) Develop new legal and policy analyses on the implications of sharing such synthetic data, ML models, and hypotheses. By addressing these three critical areas and engaging key stakeholders, the tools developed by this project stand a high probability of gaining adoption and of providing tremendous value to the country by improving cybersecurity.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.