Machine learning is fueling major advances in biomedical research, natural language processing, image recognition, self-driving vehicles, and other areas. These advances depend on the continuing availability of data. However, machine learning can unintentionally reveal sensitive data, such as the identity of specific persons. By assuring the integrity and privacy of both the data and the machine learning models based on this data, this project aims to bring the benefits of machine learning to all data holders. The project will investigate unintended learning, i.e., what machine learning models are discovering beyond their stated objectives, and its consequences for the data on which the models are trained and to which they are applied.

The project's first focus area is developing inference techniques that detect leakage of sensitive data and, more generally, determine what models are actually learning. This research will help identify the root causes of training-data memorization in deep models, develop methodology for detecting and measuring it, and help prevent deep models from unintentionally learning privacy-sensitive features of the data. The second focus area is developing and analyzing methods for mitigating unintended learning and ensuring that models do not contain unwanted or malicious functionality. In addition to improving the privacy of the training and test data, this research will help detect and prevent backdoors in models trained on smartphones and other edge devices. All technologies developed as part of this project will be evaluated on state-of-the-art image-analysis and natural-language models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1916717
Program Officer: Sara Kiesler
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2022-09-30
Support Year:
Fiscal Year: 2019
Total Cost: $499,997
Indirect Cost:

Name: Cornell University
Department:
Type:
DUNS #:
City: Ithaca
State: NY
Country: United States
Zip Code: 14850