Machine learning is increasingly applied to assist consequential decision-making, typically by learning a model that automatically scores people's potential and prioritizes favorable decisions for those who receive higher scores. While this enables more efficient and evidence-based decision-making, recent studies show that many model scores are biased against minority groups and can cause negative societal impacts. These findings have triggered intensive research interest in developing fair machine learning techniques that mitigate demographic bias in model scoring. The problem is that existing developments conflict with data privacy regulations: most fair learning techniques require free access to individuals' sensitive demographic data, but such access is increasingly restricted to protect privacy. There are ongoing debates about whether it is permissible or necessary to use sensitive demographic data in fair machine learning, but no consensus has been reached, owing to the lack of scientific investigation. This project aims to fill that gap, both to establish a fundamental relation between fairness and privacy and to broaden the deployment and impact of fair learning techniques in real-world applications. The project will also have an important educational impact through the involvement of underrepresented students in computer science research and the creation of a new curriculum on ethical machine learning to train the next generation of ethics-aware data scientists.

This project will develop novel fair machine learning techniques that operate with restricted access to sensitive demographic data (SDD). Three scenarios will be formulated and solved: SDD is not accessible; SDD can be accessed at a cost; and SDD can be accessed through a private third party. To tackle these scenarios, this project will integrate fairness objectives with a variety of sophisticated learning techniques, including transfer learning, active learning, distributed learning, and private learning. The project will also investigate a fundamental relation between fairness and privacy in machine learning: how much fairness can be achieved in a model's scoring if certain privacy of the sensitive demographic data must be protected when learning the model. The developed solutions will be presented at conference and journal venues, and the project website will provide access to the results, with references to the code for the developed and evaluated algorithms.
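As background for the fairness objectives mentioned above, one widely used group-fairness criterion is demographic parity: the rate of favorable decisions should be similar across demographic groups. The sketch below is illustrative only, not the project's method; the function name, threshold, and example data are assumptions for demonstration.

```python
def demographic_parity_gap(scores, groups, threshold=0.5):
    """Largest difference in favorable-decision rates across groups.

    scores: model scores in [0, 1].
    groups: sensitive-attribute label for each individual.
    threshold: score at or above which a decision counts as favorable.
    """
    rates = {}
    for g in set(groups):
        decisions = [s >= threshold for s, grp in zip(scores, groups) if grp == g]
        rates[g] = sum(decisions) / len(decisions)
    return max(rates.values()) - min(rates.values())


# Example: group 0 receives favorable decisions 75% of the time,
# group 1 only 25% of the time, so the gap is 0.5.
gap = demographic_parity_gap(
    scores=[0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.4],
    groups=[0, 0, 0, 0, 1, 1, 1, 1],
)
```

Note that evaluating even this simple metric requires the group labels themselves, which is exactly the access that privacy regulations may restrict, motivating the three SDD scenarios above.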

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1850418
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2021-02-28
Support Year:
Fiscal Year: 2018
Total Cost: $174,998
Indirect Cost:
Name: University of Wyoming
Department:
Type:
DUNS #:
City: Laramie
State: WY
Country: United States
Zip Code: 82071