princetonvisualai / gan-debiasing
Fair Attribute Classification through Latent Space De-biasing (CVPR 2021)
Hi,
On page 5 of the paper, Bias Amplification is explained as follows:
For each pair of target and protected attribute values, we add (P_{t|g} − P_{t̂|g})
if P_{t,g} > P_t P_g and −(P_{t|g} − P_{t̂|g}) otherwise,
where
P_{t|g} is the fraction of images with protected attribute g that have target attribute t, and
P_{t̂|g} is the fraction of images with protected attribute g that are **predicted** to have target attribute t.
However, based on the original Directional Bias Amplification paper,
I think the correct notation should be:
we add (P_{t̂|g} − P_{t|g}) if P_{t,g} > P_t P_g and −(P_{t̂|g} − P_{t|g}) otherwise,
since we are measuring A -> T.
The code also suggests that the difference is computed as (P_{t̂|g} − P_{t|g}).
Line 256 in 6cec2a2
Did I understand this correctly, or did I miss something? Thanks!
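To make the sign convention in this question concrete, here is a minimal sketch of the A -> T computation with the predicted-minus-actual ordering. It assumes binary target and protected attributes stored as 0/1 arrays with both groups non-empty, and it averages over the four (t, g) pairs as in the Directional Bias Amplification definition. The function name `bias_amp_a_to_t` is hypothetical; this is not the repository's code.

```python
import numpy as np

def bias_amp_a_to_t(targets, protected, preds):
    """Sketch of directional bias amplification A -> T for binary attributes."""
    targets = np.asarray(targets)
    protected = np.asarray(protected)
    preds = np.asarray(preds)
    total = 0.0
    for t in (0, 1):
        for g in (0, 1):
            in_g = protected == g
            p_t = np.mean(targets == t)
            p_g = np.mean(in_g)
            p_tg = np.mean((targets == t) & in_g)
            p_t_given_g = np.mean(targets[in_g] == t)     # actual rate within group g
            p_that_given_g = np.mean(preds[in_g] == t)    # predicted rate within group g
            delta = p_that_given_g - p_t_given_g          # predicted minus actual
            # Add delta when t and g are positively correlated, subtract otherwise.
            total += delta if p_tg > p_t * p_g else -delta
    return total / 4  # assumption: average over the |T| * |G| = 4 pairs
```

With this ordering, a classifier that pushes predictions further in the direction of the existing correlation gets a positive score, and predictions identical to the labels give exactly 0.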
Hi All,
I've recently started checking the fairness of models on visualization tasks.
While running linear.py (gan-debiasing project), I got a FileNotFoundError for all_Male_scores.pkl.
Please let me know if I made a mistake in the procedure below:
1. Got the CelebA dataset and put it in data/celeb
2. Ran crop_images.py to crop the images to 128*128
3. Ran generate_images.py --experiment orig
4. Ran get_scores.py
   Note: it only generated the all_Smiling_scores.pkl file. It did not generate the all_Male_scores.pkl file.
5. Ran linear.py, which throws an exception ("all_Male_scores.pkl file not found")
Thanks.
gan-debiasing/Models/attr_classifier.py
Line 43 in 505f085
Respected sir
While setting up the repository, I was unable to find a requirements file listing the dependencies the code needs to run. Could you please guide me?
Edit: Do you recommend installing the dependencies given in the reference Coursera Google Colab notebook?
# Import libraries.
import numpy as np
np.random.seed(123)
from sklearn import svm
import matplotlib.pyplot as plt
deo, deo_std = utils.bootstrap_deo(val_targets[:, 1], val_targets[:, 0], val_pred)
bootstrap_deo doesn't have a default value for repeats, and this call doesn't pass one.
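One way the signature could accommodate the call above is to give repeats a default. The sketch below is hypothetical and is not the repository's `utils.bootstrap_deo`: it assumes the argument order (protected, targets, predictions) seen in the call, binary 0/1 arrays, and DEO defined as the absolute difference in false negative rates between the two protected groups. The default of 100 repeats and the `seed` parameter are arbitrary choices for illustration.

```python
import numpy as np

def deo(protected, targets, preds):
    """Absolute difference in false negative rates between groups 0 and 1."""
    fnr = []
    for g in (0, 1):
        pos = (protected == g) & (targets == 1)
        if not pos.any():
            return np.nan  # undefined when a group has no positive samples
        fnr.append(np.mean(preds[pos] == 0))
    return abs(fnr[0] - fnr[1])

def bootstrap_deo_sketch(protected, targets, preds, repeats=100, seed=0):
    """Mean and standard deviation of DEO over bootstrap resamples."""
    protected, targets, preds = map(np.asarray, (protected, targets, preds))
    rng = np.random.default_rng(seed)
    n = len(targets)
    values = []
    for _ in range(repeats):
        idx = rng.integers(0, n, size=n)  # resample indices with replacement
        values.append(deo(protected[idx], targets[idx], preds[idx]))
    return np.nanmean(values), np.nanstd(values)
```

With a default in place, a call shaped like `bootstrap_deo_sketch(val_targets[:, 1], val_targets[:, 0], val_pred)` runs without passing repeats.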
Hi!
Could you please add a list of estimated run times for each of the operations/training steps in the list?
Many thanks,
Dominik
As I understand it, this function builds a dataset that combines the fake images with the original training data.
The part of the code I have questions about is:
labeldata = pickle.load(open(fake_params['attr_path'], 'rb'))
labeldata = np.tile(labeldata, 2)
The range of labeldata is (0, 175000) before np.tile. However, when assigning labels to the fake images, the samples in (15000, 175000) are given the labels in (0, 160000). Is this a misalignment?
Or is there some extra preprocessing in "all_{}_scores.pkl" that I'm not aware of?
Should I process it the same way as domdata?
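For anyone checking the alignment, it may help to see what np.tile does to indices. The sketch below uses a 5-element stand-in instead of the real score file, and `source_index` is a hypothetical helper. It only illustrates NumPy's behavior and does not settle what the repository intends.

```python
import numpy as np

labels = np.arange(5)        # stand-in for the array loaded from the .pkl file
tiled = np.tile(labels, 2)   # two back-to-back copies of the same array
print(tiled)                 # [0 1 2 3 4 0 1 2 3 4]

def source_index(i, n):
    """Index in the original length-n array that tiled[i] was copied from."""
    return i % n

# If the loaded array really has 175000 entries, as described above, a slice
# such as (160000, 320000) of the tiled array would cover original indices
# 160000..174999 followed by 0..144999.
print(source_index(160000, 175000), source_index(319999, 175000))  # 160000 144999
```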
Hello,
I have been trying to run your code and have been stuck on an error for the past few hours.
This is the error I get when I run python main.py --experiment model:
FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Smiling_scores.pkl'
I have followed the steps in the README, but I can't find where in the code this file is created.
Full output:
{'experiment': 'model', 'experiment_name': '', 'real_data_dir': 'data/celeba', 'fake_data_dir_orig': 'data/fake_images/AllGenImages/', 'fake_data_dir_new': '', 'fake_scores_target': '', 'fake_scores_protected': '', 'cuda': True, 'random_seed': 0, 'attribute': 31, 'protected_attribute': 20, 'test_mode': False, 'num_train': 160000, 'number': 0, 'device': device(type='cuda'), 'dtype': torch.float32, 'print_freq': 100, 'total_epochs': 20, 'save_folder': 'record/model\Smiling', 'optimizer_setting': {'optimizer': <class 'torch.optim.adam.Adam'>, 'lr': 0.0001, 'weight_decay': 0}, 'dropout': 0.5, 'data_setting': {'real_params': {'path': 'data/celeba', 'attribute': 31, 'protected_attribute': 20, 'number': 0}, 'fake_params': {'path_new': 'data/fake_images/Smiling/', 'path_orig': 'data/fake_images/AllGenImages/', 'attr_path': 'data/fake_images/Smiling_scores.pkl', 'dom_path': 'data/fake_images/all_Male_scores.pkl', 'range_orig_image': (15000, 175000), 'range_orig_label': (160000, 320000), 'range_new': (0, 160000)}, 'augment': True, 'params_train': {'batch_size': 32, 'shuffle': True, 'num_workers': 0}, 'params_val': {'batch_size': 64, 'shuffle': False, 'num_workers': 0}}}
Traceback (most recent call last):
  File "main.py", line 197, in <module>
    main(opt)
  File "main.py", line 63, in main
    train = create_dataset_all(
  File "load_data.py", line 60, in create_dataset_all
    labeldata = pickle.load(open(fake_params['attr_path'], 'rb'))
FileNotFoundError: [Errno 2] No such file or directory: 'data/fake_images/Smiling_scores.pkl'
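A quick pre-flight check can show which score files are missing before training starts. This is a minimal sketch: `missing_files` is a hypothetical helper, and the two paths are copied from the 'attr_path' and 'dom_path' entries in the output above.

```python
from pathlib import Path

def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]

expected = [
    "data/fake_images/Smiling_scores.pkl",   # 'attr_path' in the printed config
    "data/fake_images/all_Male_scores.pkl",  # 'dom_path' in the printed config
]

for path in missing_files(expected):
    print("missing:", path)
```

Run from the repository root, it lists every expected file that is absent, which is easier to act on than a traceback that stops at the first one.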
Hi,
The baseline model does not show the poor fairness I expected when Male is the protected attribute and Smiling is the target attribute.
I trained the standard attribute classifier with the command python main.py --experiment baseline
As per the code, the baseline model is trained with:
Dataset:
CelebA training dataset X with 162,770 images
Hyperparameters:
Binary cross-entropy loss for 20 epochs with a batch size of 32, using the Adam optimizer with a learning rate of 1e-4
Target attribute: Smiling (attribute 31)
Protected attribute: Male (attribute 20)
Fairness metrics for the baseline model:
Training epoch 19: [5001|5087], loss:0.07021914422512054
Avg precision all = 0.9827746330855125
Validation results:
AP : {:.1f} +- {:.1f} 98.52633278341692 0.13046754108938044
DEO : {:.1f} +- {:.1f} 3.3444695945977543 1.1048136540690787
BA : {:.1f} +- {:.1f} 0.06943005310537514 0.3600587245308783
KL : {:.1f} +- {:.1f} 0.01865371993659408 0.03730743987318816
Test results:
AP : {:.1f} +- {:.1f} 98.43127724632386 0.1274417440206799
DEO : {:.1f} +- {:.1f} 2.7732756124918376 1.1788377126697231
BA : {:.1f} +- {:.1f} -0.5825026659862016 0.4052127065723986
KL : {:.1f} +- {:.1f} 0.020252475423118695 0.04050495084623739
The fairness metrics above (AP, DEO, BA, KL) look fine for the baseline model,
but the baseline results reported in the paper are different (DEO, BA and KL are higher).
As I understand it, DEO, BA and KL should be high for the standard classifier.
Do I need to change anything when training the standard attribute classifier to reproduce the paper's fairness results?
Please let me know if I'm making a mistake.
Looking forward to your response.
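A side note on reading the log: each metric line shows the literal template `{:.1f} +- {:.1f}` next to raw values, which suggests the template was printed without being applied. That is an assumption, since the print call itself is not shown here. The sketch below uses a hypothetical `format_metric` helper to show how the logged validation values read once formatted.

```python
def format_metric(name, mean, std):
    """Apply the one-decimal template seen in the log to a mean and std."""
    return "{} : {:.1f} +- {:.1f}".format(name, mean, std)

# Values copied from the validation results above.
print(format_metric("AP", 98.52633278341692, 0.13046754108938044))   # AP : 98.5 +- 0.1
print(format_metric("DEO", 3.3444695945977543, 1.1048136540690787))  # DEO : 3.3 +- 1.1
```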
Hi,
get_scores.py takes in a command line argument with the attribute. You need to (1) run main.py for both the protected and target attributes, and (2) run get_scores.py with both the protected and target attributes.
Originally posted by @vramaswamy94 in #8 (comment)
Can you please provide the steps (the exact commands) for running main.py for both the target and protected attributes?
Thank you in advance.
Hello, I have been trying to run your code and have been stuck on an error for the past few hours. I get it when I run python main.py --experiment model.
I saw your previous response, but I am still confused. You said that running get_scores.py and changing the path to the location where the scores are stored could solve it. Do I need to generate scores for the newly generated images in "data/fake_images/Straight_Hair/"?
If I only modify the out_file parameter, get_scores.py still seems to hallucinate scores for the original images in "data/fake_images/AllGenImages/" instead of the newly generated ones.
What command should I run to generate the final Straight_Hair_scores.pkl file? Running python get_scores.py --attribute 32 --out_file data/fake_images/Straight_Hair_scores.pkl does not seem to produce the result I'm after.
I have faithfully followed the README instructions to reproduce the experiment, selecting the target attribute as "StraightHair". However, the baseline and final training results are as follows:
This is the improvement over baseline:
When comparing them to the results mentioned in the paper, I notice a significant difference in terms of both metrics and improvement levels:
How can I address this issue?
In main.py, line 139:
f1_score, f1_thresh = utils.get_threshold(val_targets[:, 0], val_scores)
Line 139 in cdaa0f3
However, in utils.py, line 164:
return best_t, best_acc
Line 164 in cdaa0f3
the function returns the threshold first, not the score.
Is this a mistake? I ask because f1_thresh is needed to hallucinate labels in get_scores.py.
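The effect of the unpacking order can be reproduced with a small stand-in. The function below is hypothetical and is not the repository's `utils.get_threshold`: it assumes the selection criterion is F1 (suggested by the `f1_` names at the call site, although the returned variable is called best_acc) and returns the threshold first, as the quoted return statement does.

```python
import numpy as np

def get_threshold_sketch(targets, scores):
    """Return (best_threshold, best_f1), threshold first."""
    targets = np.asarray(targets)
    scores = np.asarray(scores)
    best_t, best_f1 = 0.0, -1.0
    for t in np.unique(scores):           # candidate thresholds
        preds = scores >= t
        tp = np.sum(preds & (targets == 1))
        fp = np.sum(preds & (targets == 0))
        fn = np.sum(~preds & (targets == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom > 0 else 0.0
        if f1 > best_f1:
            best_t, best_f1 = float(t), float(f1)
    return best_t, best_f1

# Unpacking in the same order as the return statement keeps the names honest.
f1_thresh, f1_score = get_threshold_sketch([0, 0, 1, 1], [0.1, 0.4, 0.6, 0.9])
print(f1_thresh, f1_score)  # 0.6 1.0
```

Unpacking as `f1_score, f1_thresh = ...` against a function like this would put the threshold into f1_score and the score into f1_thresh.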