Fairness

Algorithms are increasingly being used for decision making and risk assessment in a wide range of applications, from criminal justice to housing and employment. While algorithms are sometimes treated as impartial, it is important to evaluate how unbiased these systems really are.

It is also important to appreciate that discrimination is not necessarily negative. In some cases discrimination is necessary or even desired. Whether a given case of discrimination is problematic depends on the situation in question. The metrics in this repository also cannot determine whether discrimination is intentional or unintentional.

WARNING: This is a work in progress! Stay tuned.

Metrics

The metrics in fairness/metrics.py examine a model's inputs and outputs to evaluate whether discrimination is present with respect to some protected class.

Disparate Impact

This metric is based on the legal term "disparate impact" and its corresponding 80 percent rule.
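As a rough illustration of the 80 percent rule (also called the four-fifths rule): a favorable-outcome rate for one group that is less than 80 percent of the rate for the most-favored group is commonly treated as evidence of disparate impact. The sketch below is not the code in fairness/metrics.py. The function name disparate_impact_ratio and the 0/1 encoding of both inputs are assumptions made for this example, and it assumes the unprotected group is the one with the higher rate.

```python
def disparate_impact_ratio(protected_status, model_outcome):
    """Ratio of favorable-outcome rates: protected group / unprotected group.

    Hypothetical helper for illustration. Both inputs are assumed to be
    equal-length sequences of 0/1 values (1 = protected, 1 = favorable).
    Raises ZeroDivisionError if a group is empty or if the unprotected
    group has no favorable outcomes.
    """
    protected = [y for p, y in zip(protected_status, model_outcome) if p]
    unprotected = [y for p, y in zip(protected_status, model_outcome) if not p]
    rate_protected = sum(protected) / len(protected)
    rate_unprotected = sum(unprotected) / len(unprotected)
    return rate_protected / rate_unprotected


# Toy data: 1 of 4 protected individuals and 2 of 4 unprotected
# individuals receive the favorable outcome.
protected_status = [1, 1, 1, 1, 0, 0, 0, 0]
model_outcome = [1, 0, 0, 0, 1, 1, 0, 0]

ratio = disparate_impact_ratio(protected_status, model_outcome)
passes_80_percent_rule = ratio >= 0.8
print(ratio, passes_80_percent_rule)  # 0.5 False
```

A ratio below 0.8 flags the outcome for closer review. It does not by itself establish that the discrimination is problematic.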

Statistical Parity / Group Fairness

Based on Zemel et al., this metric characterizes the degree of discrimination between groups, where groups are defined with respect to some protected class.
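One common way to express statistical parity, in the spirit of the discrimination measure in Zemel et al., is the absolute difference between the favorable-outcome rates of the two groups. The following is a minimal sketch under that assumption. The name statistical_parity_difference is hypothetical and the code is not taken from fairness/metrics.py.

```python
def statistical_parity_difference(protected_status, model_outcome):
    """Absolute gap between the groups' favorable-outcome rates.

    Hypothetical helper for illustration. Inputs are assumed to be
    equal-length sequences of 0/1 values, with both groups non-empty.
    0.0 means both groups receive favorable outcomes at the same rate.
    """
    protected = [y for p, y in zip(protected_status, model_outcome) if p]
    unprotected = [y for p, y in zip(protected_status, model_outcome) if not p]
    rate_protected = sum(protected) / len(protected)
    rate_unprotected = sum(unprotected) / len(unprotected)
    return abs(rate_protected - rate_unprotected)


# Toy data: favorable rate is 0.25 for the protected group
# and 0.5 for the unprotected group.
protected_status = [1, 1, 1, 1, 0, 0, 0, 0]
model_outcome = [1, 0, 0, 0, 1, 1, 0, 0]

gap = statistical_parity_difference(protected_status, model_outcome)
print(gap)  # 0.25
```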

Individual Fairness

Also based on Zemel et al., this metric characterizes the degree to which similar people are classified similarly [TODO].
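Since this metric is still marked TODO, the following is only a sketch of one possible approach: a k-nearest-neighbor consistency score similar in spirit to the one described by Zemel et al. It compares each individual's outcome with the average outcome of that individual's nearest neighbors. The function name knn_consistency, the use of Euclidean distance, and the default k are all assumptions made for this example.

```python
import math


def knn_consistency(features, model_outcome, k=2):
    """1 minus the mean absolute gap between each outcome and the
    average outcome of that individual's k nearest neighbors.

    Hypothetical helper for illustration. `features` is a sequence of
    equal-length numeric tuples and there must be more than k rows.
    1.0 means every individual is treated like their neighbors.
    """
    n = len(features)
    total_gap = 0.0
    for i in range(n):
        # Rank all other individuals by Euclidean distance to individual i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        neighbors = others[:k]
        neighbor_mean = sum(model_outcome[j] for j in neighbors) / len(neighbors)
        total_gap += abs(model_outcome[i] - neighbor_mean)
    return 1.0 - total_gap / n


# Two well-separated clusters of three individuals each.
features = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]

# Every individual receives the same outcome as their neighbors.
consistent = knn_consistency(features, [1, 1, 1, 0, 0, 0])

# One individual in the first cluster is treated differently.
inconsistent = knn_consistency(features, [1, 0, 1, 0, 0, 0])

print(consistent)  # 1.0
print(inconsistent < consistent)  # True
```

The result depends heavily on which features define similarity, so the choice of distance function is a modeling decision rather than a detail.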

Fairness in Errors

This metric measures if the algorithm in question has a higher error rate with respect to the protected class. Unevenly distributed errors are another way that algorithms can be unfair.
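A simple way to check for unevenly distributed errors is to compute the error rate separately for each group and compare them. The sketch below assumes that ground-truth labels are available alongside the predictions. The names error_rate_gap and true_outcome are hypothetical, and the code is not taken from fairness/metrics.py.

```python
def error_rate(true_outcome, model_outcome):
    """Fraction of predictions that differ from the true label."""
    pairs = list(zip(true_outcome, model_outcome))
    return sum(1 for truth, predicted in pairs if truth != predicted) / len(pairs)


def error_rate_gap(protected_status, true_outcome, model_outcome):
    """Error rate of the protected group minus that of the unprotected group.

    Hypothetical helper for illustration. Inputs are assumed to be
    equal-length sequences, with both groups non-empty. A positive value
    means the model errs more often on the protected group.
    """
    rows = list(zip(protected_status, true_outcome, model_outcome))
    protected = [(t, y) for p, t, y in rows if p]
    unprotected = [(t, y) for p, t, y in rows if not p]
    err_protected = error_rate(*zip(*protected))
    err_unprotected = error_rate(*zip(*unprotected))
    return err_protected - err_unprotected


# Toy data: 2 of 4 protected predictions are wrong,
# versus 1 of 4 unprotected predictions.
protected_status = [1, 1, 1, 1, 0, 0, 0, 0]
true_outcome = [1, 0, 1, 0, 1, 0, 1, 0]
model_outcome = [0, 0, 0, 0, 1, 0, 1, 1]

error_gap = error_rate_gap(protected_status, true_outcome, model_outcome)
print(error_gap)  # 0.25
```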

Examples

Clone the repo and make sure the requirements (in requirements.conda) are satisfied.

You can then use metrics to test how fair the algorithm's predictions are:

In [1]: import pandas as pd

In [2]: from fairness import metrics

In [3]: data = pd.read_csv('tests/validation.csv', index_col=0)

In [4]: protected_status = data['gender'].values

In [5]: model_outcome = data['prediction'].values

In [6]: metrics.group_fairness(protected_status, model_outcome)
Out[6]: 0.0

From this result you can infer that the algorithm is not discriminatory with respect to gender, at least as measured by this metric.

Auditing

The auditing/ directory contains a test web application for simple auditing of model outputs for fairness.

First, upload your model outputs in CSV format.

Next, run the audit and examine each metric. For one example model, the audit shows that there may be concerns with respect to the classification of women.

Tests

The unit tests in tests/ use an artificial dataset, tests/validation.csv. Run them with nosetests -v.

References

Zemel, Richard S., Wu, Yu, Swersky, Kevin, Pitassi, Toniann, and Dwork, Cynthia. "Learning Fair Representations." Paper presented at the meeting of the ICML (3), 2013.
