gan-debiasing's People

Contributors

sunniesuhyoung, vramaswamy94

gan-debiasing's Issues

Wrong notation for Bias Amplification in the paper

Hi,
On page 5 of the paper, Bias Amplification is explained as follows:

For each pair of target and protected attribute values, we add $(P_{t|g} - P_{\hat{t}|g})$ if $P_{t,g} > P_t P_g$ and $-(P_{t|g} - P_{\hat{t}|g})$ otherwise,

where $P_{t|g}$ is the fraction of images with protected attribute g that have target attribute t, and
$P_{\hat{t}|g}$ is the fraction of images with protected attribute g that are **predicted** to have target attribute t.

However, based on the definition in the original Directional Bias Amplification paper (image omitted), I think the correct notation should be:

we add $(P_{\hat{t}|g} - P_{t|g})$ if $P_{t,g} > P_t P_g$ and $-(P_{\hat{t}|g} - P_{t|g})$ otherwise,

since we are measuring A -> T.

The code also suggests that the difference is $(P_{\hat{t}|g} - P_{t|g})$:

diff[i][j] = pred_bog[i][j] - data_bog[i][j]

Did I understand this correctly, or did I miss something? Thanks!
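To make the sign convention in the question concrete, here is a minimal Python sketch of the A -> T computation as this issue proposes it, using the predicted-minus-actual difference. The function name, the argument order, the restriction to binary attributes, and the averaging over the four (t, g) pairs are assumptions made for illustration; this is not the repository's implementation.

```python
import numpy as np

def bias_amplification(targets, protected, preds):
    """Sketch of directional bias amplification (A -> T) for binary attributes.

    Assumes all inputs are 0/1 arrays of equal length and that both
    protected groups are non-empty.
    """
    targets = np.asarray(targets).astype(int)
    protected = np.asarray(protected).astype(int)
    preds = np.asarray(preds).astype(int)

    total = 0.0
    for t in (0, 1):
        for g in (0, 1):
            in_group = protected == g
            p_t_given_g = np.mean(targets[in_group] == t)    # P_{t|g}
            p_that_given_g = np.mean(preds[in_group] == t)   # P_{t^|g}
            p_tg = np.mean((targets == t) & in_group)        # P_{t,g}
            p_t = np.mean(targets == t)                      # P_t
            p_g = np.mean(in_group)                          # P_g

            # Predicted minus actual, as in diff = pred_bog - data_bog.
            delta = p_that_given_g - p_t_given_g
            # Keep the sign if t and g are positively correlated, flip it otherwise.
            total += delta if p_tg > p_t * p_g else -delta

    # Averaging over the four (t, g) pairs is an assumption of this sketch.
    return total / 4.0
```

With this convention, a classifier that exaggerates an existing correlation between the protected and target attributes gets a positive score, and one that reproduces the labels exactly gets zero.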

FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/all_Male_scores.pkl'

Hi all,
I've recently started working on checking the fairness of models on visualization tasks.

While running linear.py (gan-debiasing project), I got a FileNotFoundError for all_Male_scores.pkl.

Please let me know if I made a mistake in the procedure below:

  1. Downloaded the CelebA dataset and put it in data/celeb
  2. Ran crop_images.py to crop the images to 128x128
  3. Ran main.py --experiment baseline to train a standard attribute classifier for each target attribute
  4. Ran generate_images.py --experiment orig
  5. Ran get_scores.py. Note: it generated only all_Smiling_scores.pkl, not all_Male_scores.pkl.
  6. Ran linear.py, which throws the exception ("all_Male_scores.pkl file not found")

Thanks.
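A small pre-flight check can show which score files are still missing before linear.py is run. The sketch below is hypothetical: the helper name is invented, and the `all_{attribute}_scores.pkl` naming pattern and the `data/fake_images` directory are taken only from the file names quoted in this issue.

```python
from pathlib import Path

def missing_score_files(fake_dir, attributes):
    """Return the expected score files that do not exist yet.

    The all_<attribute>_scores.pkl pattern is an assumption based on the
    file names that appear in the error messages on this page.
    """
    fake_dir = Path(fake_dir)
    expected = [fake_dir / f"all_{name}_scores.pkl" for name in attributes]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Target attribute (Smiling) and protected attribute (Male) from this issue.
    for path in missing_score_files("data/fake_images", ["Smiling", "Male"]):
        print("missing:", path)
```

If the protected-attribute file is listed as missing, that is consistent with get_scores.py having been run for the target attribute only.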

Query: Requirements/dependencies

Respected sir,
While setting up the repository, I was unable to find a text file listing the requirements/dependencies needed to run the code. Could you please guide me?

Edit: Do you recommend installing the dependencies given in the reference Coursera Google Colab notebook?

# Import libraries.
import numpy as np
np.random.seed(123)
from sklearn import svm
import matplotlib.pyplot as plt
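Since this thread does not point to a requirements file, one way to see what an environment lacks is to probe for the modules that appear on this page: numpy, sklearn, and matplotlib in the snippet above, and torch in the configuration dump quoted in a later issue. The module list is an assumption drawn from this page, not an official dependency list, and the helper name is hypothetical.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list, collected from code and logs quoted on this page.
    candidates = ["numpy", "sklearn", "matplotlib", "torch"]
    print("not installed:", missing_modules(candidates))
```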

Crash main.py#144

deo, deo_std = utils.bootstrap_deo(val_targets[:, 1], val_targets[:, 0], val_pred)

bootstrap_deo doesn't have a default value for repeats, so this call crashes.
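One possible fix is to give `repeats` a default value. The sketch below is not the repository's `utils.bootstrap_deo`; it only mirrors the argument order of the call above (protected labels, target labels, predictions). The definition of DEO as the absolute gap in true positive rate between the two protected groups, the default of 100 repeats, and the `seed` parameter are all assumptions.

```python
import numpy as np

def deo(protected, targets, preds):
    """Absolute true-positive-rate gap between protected groups 0 and 1 (assumed definition)."""
    rates = []
    for g in (0, 1):
        positives = (protected == g) & (targets == 1)
        rates.append(np.mean(preds[positives] == 1))
    return abs(rates[0] - rates[1])

def bootstrap_deo(protected, targets, preds, repeats=100, seed=0):
    """Mean and standard deviation of DEO over bootstrap resamples.

    Assumes every resample contains positive examples from both groups.
    """
    protected = np.asarray(protected)
    targets = np.asarray(targets)
    preds = np.asarray(preds)

    rng = np.random.default_rng(seed)
    n = len(targets)
    values = []
    for _ in range(repeats):
        idx = rng.integers(0, n, size=n)  # sample n indices with replacement
        values.append(deo(protected[idx], targets[idx], preds[idx]))
    return float(np.mean(values)), float(np.std(values))
```

With a default in place, the three-argument call in main.py would run unchanged; passing `repeats` explicitly at the call site would work equally well.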

Training time

Hi!

Could you please add a list of estimated run times for each of the operations/training steps in the list?

Many thanks,
Dominik

function create_dataset_all in load_data.py

As I understand it, this function builds a dataset that combines the fake images with the original training data.
The part of the code I have questions about is
labeldata = pickle.load(open(fake_params['attr_path'], 'rb'))
labeldata = np.tile(labeldata, 2)

The range of labeldata is (0, 175000) before np.tile. However, when assigning labels to the fake images, you assign the samples in (15000, 175000) as labels in (0, 160000). Is this a misalignment?
Or is there some extra preprocessing applied to "all_{}_scores.pkl" that I don't know about?

Should I process it the same way as domdata?
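Whether the ranges line up depends on the length of the array before tiling, because index i of `np.tile(x, 2)` reads from `x[i % len(x)]`. The sketch below only illustrates that mapping; the helper name is hypothetical, and the lengths tried in the demo (160000 and 175000) and the range (160000, 320000) are taken from the numbers quoted on this page, not from inspecting the actual pickle file.

```python
import numpy as np

def tiled_source_indices(n, start, stop):
    """Original indices read by tiled[start:stop], where tiled = np.tile(x, 2) and len(x) == n."""
    return np.arange(start, stop) % n


if __name__ == "__main__":
    for n in (160000, 175000):
        src = tiled_source_indices(n, 160000, 320000)
        print(f"len={n}: first source index {src[0]}, last source index {src[-1]}")
```

If the stored array has 160000 entries, the slice maps back onto the original indices in order starting from 0; if it has 175000 entries, the slice starts at index 160000 and wraps around, which would be the misalignment described in the question.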

FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Smiling_scores.pkl'

Hello,

I was trying to run your code and I have been struggling with an error for the past few hours.

This is the error I got when I tried to run python main.py --experiment model:
FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Smiling_scores.pkl'

I have followed the steps in the README, but I can't find where in the code this file is created.

Full output:
{'experiment': 'model', 'experiment_name': '', 'real_data_dir': 'data/celeba', 'fake_data_dir_orig': 'data/fake_images/AllGenImages/', 'fake_data_dir_new': '', 'fake_scores_target': '', 'fake_scores_protected': '', 'cuda': True, 'random_seed': 0, 'attribute': 31, 'protected_attribute': 20, 'test_mode': False, 'num_train': 160000, 'number': 0, 'device': device(type='cuda'), 'dtype': torch.float32, 'print_freq': 100, 'total_epochs': 20, 'save_folder': 'record/model\Smiling', 'optimizer_setting': {'optimizer': <class 'torch.optim.adam.Adam'>, 'lr': 0.0001, 'weight_decay': 0}, 'dropout': 0.5, 'data_setting': {'real_params': {'path': 'data/celeba', 'attribute': 31, 'protected_attribute': 20, 'number': 0}, 'fake_params': {'path_new': 'data/fake_images/Smiling/', 'path_orig': 'data/fake_images/AllGenImages/', 'attr_path': 'data/fake_images/Smiling_scores.pkl', 'dom_path': 'data/fake_images/all_Male_scores.pkl', 'range_orig_image': (15000, 175000), 'range_orig_label': (160000, 320000), 'range_new': (0, 160000)}, 'augment': True, 'params_train': {'batch_size': 32, 'shuffle': True, 'num_workers': 0}, 'params_val': {'batch_size': 64, 'shuffle': False, 'num_workers': 0}}}
Traceback (most recent call last):
  File "main.py", line 197, in <module>
    main(opt)
  File "main.py", line 63, in main
    train = create_dataset_all(
  File "load_data.py", line 60, in create_dataset_all
    labeldata = pickle.load(open(fake_params['attr_path'], 'rb'))
FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Smiling_scores.pkl'

Baseline model does not show the expected drop in fairness

Hi,

The baseline model's fairness did not degrade when Male is the protected attribute and Smiling is the target attribute.

I trained the standard attribute classifier using the command python main.py --experiment baseline
As per the code, the baseline model is trained with:

Dataset: CelebA training dataset X with 162,770 images
Hyperparameters: binary cross-entropy loss for 20 epochs with a batch size of 32, using the Adam optimizer with a learning rate of 1e-4
Target attribute: Smiling (attribute 31)
Protected attribute: Male (attribute 20)

Fairness metrics for Baseline model :

Training epoch 19: [5001|5087], loss:0.07021914422512054
Avg precision all = 0.9827746330855125
Validation results:
AP : {:.1f} +- {:.1f} 98.52633278341692 0.13046754108938044
DEO : {:.1f} +- {:.1f} 3.3444695945977543 1.1048136540690787
BA : {:.1f} +- {:.1f} 0.06943005310537514 0.3600587245308783
KL : {:.1f} +- {:.1f} 0.01865371993659408 0.03730743987318816
Test results:
AP : {:.1f} +- {:.1f} 98.43127724632386 0.1274417440206799
DEO : {:.1f} +- {:.1f} 2.7732756124918376 1.1788377126697231
BA : {:.1f} +- {:.1f} -0.5825026659862016 0.4052127065723986
KL : {:.1f} +- {:.1f} 0.020252475423118695 0.04050495084623739

As you can see, the metrics above (AP, DEO, BA, KL) look fine for the baseline model,
but the results reported in the paper for the baseline model are different (DEO, BA, and KL are high).

As I understand it, the fairness metrics DEO, BA, and KL should be high for the standard classifier.

Do I need to change anything when training the standard attribute classifier to reproduce the paper's fairness results?

Please let me know if I made a mistake.
Looking forward to your response.
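A side note on the log in this issue: the literal `{:.1f}` placeholders in the output suggest that the format string was printed next to the values instead of being applied to them. The sketch below shows the difference; it is a guess at the cause, not the repository's code, and the helper name is hypothetical.

```python
def format_metric(name, mean, std):
    """Format a metric as 'NAME : mean +- std' with one decimal place."""
    return "{} : {:.1f} +- {:.1f}".format(name, mean, std)


if __name__ == "__main__":
    mean, std = 3.3444695945977543, 1.1048136540690787

    # Passing the template and the values as separate print arguments
    # leaves the placeholders unformatted, as in the log above.
    print("DEO : {:.1f} +- {:.1f}", mean, std)

    # Applying str.format fills the placeholders.
    print(format_metric("DEO", mean, std))
```

This affects only how the numbers are displayed, not the metric values themselves.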

How to run `main.py` file for both target and protected attribute

Hi,

get_scores.py takes in a command line argument with the attribute. You need to (1) run main.py for both the protected and target attributes, and (2) run get_scores.py with both the protected and target attributes.

Originally posted by @vramaswamy94 in #8 (comment)

Could you please provide the steps (the exact commands) to run main.py for both the target and protected attributes?

Thank you in advance.

FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Straight_Hair_scores.pkl'

Hello, I was trying to run your code and have been struggling with an error for the past few hours. This is the error I got when I tried to run python main.py --experiment model.

I saw your previous response, but I am still confused. You said that running get_scores.py and changing the path to the location where the scores are stored would solve it. Do I need to generate scores for the newly generated images in "data/fake_images/Straight_Hair/"? If I only modify the out_file parameter, get_scores.py still seems to hallucinate scores for the original images in "data/fake_images/AllGenImages/" instead of the newly generated ones.

What command should I run to generate the final Straight_Hair_scores.pkl file? python get_scores.py --attribute 32 --out_file data/fake_images/Straight_Hair_scores.pkl doesn't seem to produce the outcome I'm aiming for.

Some Questions about Reproducing Experimental Results

I have faithfully followed the README instructions to reproduce the experiment, selecting "StraightHair" as the target attribute. However, the baseline and final training results are as follows:
[image]
[image]
This is the improvement over the baseline:
[image]
When comparing them to the results reported in the paper, I notice a significant difference in both the metrics and the level of improvement:
[image]
[image]
How can I address this issue?

f1_score and f1_thresh swapped?

In main.py line 139:
f1_score, f1_thresh = utils.get_threshold(val_targets[:, 0], val_scores)

However, in utils.py line 164:
return best_t, best_acc

It returns the threshold first, not the score.

Is this a mistake? I ask because f1_thresh is needed to hallucinate labels in get_scores.py.
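The effect of the unpacking order can be seen with a stand-in for `get_threshold`. The implementation below is hypothetical (a simple scan over candidate thresholds that maximizes F1); the only detail taken from the issue is that the function returns the threshold first and the score second.

```python
import numpy as np

def get_threshold(targets, scores, num_steps=101):
    """Hypothetical stand-in: return (best_threshold, best_f1), threshold first."""
    targets = np.asarray(targets).astype(int)
    scores = np.asarray(scores, dtype=float)

    best_t, best_f1 = 0.0, -1.0
    for t in np.linspace(0.0, 1.0, num_steps):
        preds = scores >= t
        tp = np.sum(preds & (targets == 1))
        fp = np.sum(preds & (targets == 0))
        fn = np.sum(~preds & (targets == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        if f1 > best_f1:  # strict comparison keeps the lowest best threshold
            best_t, best_f1 = float(t), float(f1)
    return best_t, best_f1


if __name__ == "__main__":
    targets = [0, 0, 1, 1]
    scores = [0.1, 0.2, 0.8, 0.9]

    # Unpack in the same order as the return statement.
    f1_thresh, f1_score = get_threshold(targets, scores)
    print("threshold:", f1_thresh, "F1:", f1_score)
```

Unpacking as `f1_score, f1_thresh = ...` against such a return order would silently store the threshold under the name `f1_score` and the score under `f1_thresh`, so any later thresholding would use the wrong number.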
