Data-driven modeling has moved beyond the realm of consumer predictions and recommendations into areas of policy and planning that have a profound impact on our daily lives. The tools of data analysis are being harnessed to predict crime, select candidates for jobs, identify security threats, determine credit risk, and even decide treatment plans and interventions for patients. Automated learning and mining tools can process enormous volumes and varieties of data in order to detect patterns and make predictions. As is rapidly becoming clear, these tools can also introduce discriminatory behavior and amplify biases present in the data they are trained on. In this project, the PIs will study the problems of discrimination and bias in algorithmic decision-making. By studying all aspects of the data pipeline, from data preparation to learning, evaluation, and feedback, they will develop tools for analyzing, auditing, and designing automated decision-making systems that are fair, accountable, and transparent. To broaden the impact of this research, the PIs will develop a course curriculum that educates the next generation of data scientists on the ethical, legal, and societal implications of algorithmic decision-making, so that they carry this understanding into the workforce. Initial efforts by the PIs have attracted students from groups underrepresented in computer science, and they will continue these efforts. The PIs will also explore the legal and policy ramifications of this research and develop best-practice guidelines for the use of their tools by policy makers, lawyers, journalists, and other practitioners.

The PIs will explore the technical subject of this project in three ways. First, they will develop a sound theoretical framework for reasoning about algorithmic fairness. This framework carefully separates mechanisms, beliefs, and assumptions in order to make explicit the implicit assumptions about the nature of fairness in learning. Second, by examining the entire pipeline of tasks associated with learning, they will identify previously unexplored areas where bias may be unintentionally introduced, as well as novel problems associated with ensuring fairness; these include the initial stages of data preparation, various kinds of fairness-aware learning, and evaluation. Third, they will investigate the problem of feedback: actions based on a biased learned model can create a feedback loop that changes the underlying reality and leads to yet more bias.
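As a concrete illustration of the kind of fairness criterion such a framework might formalize, the following sketch checks demographic parity, the requirement that a model's positive-decision rate be independent of a protected attribute. This is a hypothetical example rather than the PIs' method: the function name, the toy data, and the choice of demographic parity (one of several competing fairness definitions) are assumptions made here purely for illustration.

    import numpy as np

    def demographic_parity_gap(y_pred, group):
        # Difference in positive-decision rates between two groups.
        # y_pred: 0/1 model decisions; group: 0/1 protected attribute.
        # A gap near 0 means decisions are (nearly) independent of
        # group membership; a large gap flags potential disparate impact.
        y_pred = np.asarray(y_pred)
        group = np.asarray(group)
        return y_pred[group == 1].mean() - y_pred[group == 0].mean()

    # Hypothetical audit of decisions from some trained model:
    decisions = [1, 0, 1, 1, 0, 0, 1, 0]
    groups    = [0, 0, 0, 0, 1, 1, 1, 1]
    print(demographic_parity_gap(decisions, groups))  # -0.5

An auditing tool of the kind proposed here would compute such gaps, and analogues for other fairness definitions, at each stage of the pipeline, since a disparity can be introduced during data preparation or evaluation just as easily as during learning.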

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1633400
Program Officer: Sylvia Spengler
Budget Start: 2016-09-01
Budget End: 2019-08-31
Fiscal Year: 2016
Total Cost: $296,616
Name: Data & Society Research Institute
City: New York
State: NY
Country: United States
Zip Code: 10003