SaTC: CORE: Medium: Hidden Rules in Neural Networks as Attacks and Adversarial Defenses

Zhao, Ben; Zheng, Haitao; Lopes, Pedro

Abstract

Recent advances in Deep Neural Networks (DNNs) have enabled significant progress on technological challenges such as voice and facial recognition, language translation, and image recognition. Yet DNNs remain vulnerable to a class of hidden attacks called "backdoor" or "Trojan" attacks, in which hidden rules are trained into a model and become active only on inputs that carry some unusual property, called a "trigger." These rules are strong enough that the presence of a small, inconspicuous trigger can make the model produce unexpected (and often erroneous) results, e.g., recognizing anyone with a black ankh tattoo as a predetermined celebrity. Despite recent efforts, these attacks remain poorly understood, and robust defenses remain elusive. This project studies this class of attacks in depth to understand their potential impact on real machine learning systems and to identify potential defenses.
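
To make the trigger idea above concrete, the following is a minimal sketch of one common way such a hidden rule could be planted: stamping a small patch onto a fraction of training images and relabeling them to an attacker-chosen class. This is an illustrative assumption, not the project's own method. The function names (stamp_trigger, poison_dataset), the corner patch, the patch size, and the poisoning rate are all hypothetical, and images are modeled as plain nested lists of grayscale values so the example needs only the Python standard library.

```python
import random


def stamp_trigger(image, size=2, value=1.0):
    """Return a copy of a 2D grayscale image with a size x size patch
    set to `value` in the bottom-right corner (the hypothetical trigger)."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - size, height):
        for c in range(width - size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_label, rate=0.1, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel
    those samples to `target_label`. The inputs are not modified."""
    rng = random.Random(seed)
    n_poison = int(len(images) * rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    poisoned_images = [
        stamp_trigger(img) if i in chosen else [row[:] for row in img]
        for i, img in enumerate(images)
    ]
    poisoned_labels = [
        target_label if i in chosen else label
        for i, label in enumerate(labels)
    ]
    return poisoned_images, poisoned_labels, sorted(chosen)


# Toy data (assumed for illustration): 20 blank 8x8 images, 4 classes.
images = [[[0.0] * 8 for _ in range(8)] for _ in range(20)]
labels = [i % 4 for i in range(20)]
p_images, p_labels, chosen = poison_dataset(
    images, labels, target_label=3, rate=0.25
)
```

A model trained on p_images and p_labels would see the patch consistently paired with class 3, which is the kind of association a backdoor relies on; whether a given model actually learns it depends on the architecture, the data, and the poisoning rate.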

More specifically, the project will first catalog the breadth of backdoor attacks, and potential defenses, across multiple domains, including images (facial and object recognition), text (natural language processing and sentiment analysis), and audio (speaker recognition and voice transcription). The project will then explore their practical implications outside the digital domain, including backdoor attacks in the physical world (such as on facial recognition), and advanced backdoors that coexist with transfer learning, the prevailing method for sharing DNN models today. Finally, the project will explore potential positive uses of backdoors as model-training tools, yielding a novel protection mechanism for DNN models that traps adversarial attacks with honeypots built using backdoor techniques. The project will evaluate both advanced attacks and defenses across a broad range of applications, datasets, and models and, whenever possible, through experiments in the physical domain. Successful results from this project should alert security professionals to the risk of backdoors in DNNs, while providing the software and algorithmic tools necessary for robust defenses.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1949650
Program Officer: Phillip Regalia

Budget Start: 2020-03-01
Budget End: 2024-02-29
Fiscal Year: 2019
Total Cost: $1,248,000

University of Chicago, Chicago, IL, United States
