Hey Mateo,
Thanks for your work (I really admire your papers on concept-based models 😁) and for releasing your code!
I am trying to train Concept Bottleneck Models on CelebA using this repository. Following your approach, I am using 8 concepts, but without any hidden concepts (so all 8 concepts are used). I have tried training both independent and sequential concept bottleneck models. This is the config:
```yaml
trials: 1
results_dir: /mnt/qb/work/bethge/bkr046/CEM/results/CelebA
dataset: celeba
root_dir: /mnt/qb/work/bethge/bkr046/DATASETS/celeba_torchvision #cem/data/
image_size: 64
num_classes: 1000
batch_size: 512
use_imbalance: True
use_binary_vector_class: True
num_concepts: 8
label_binary_width: 1
label_dataset_subsample: 12
#num_hidden_concepts: 0
selected_concepts: False
num_workers: 8
sampling_percent: 1
test_subsampling: 1

intervention_freq: 1
intervention_batch_size: 1024
intervention_policies:
  - "group_random_no_prior"
competence_levels: [1, 0]
incompetence_intervention_policies:
  - "group_random_no_prior"
skip_repr_evaluation: True

shared_params:
  top_k_accuracy: [3, 5, 10]
  save_model: True
  max_epochs: 200
  patience: 15
  emb_size: 16
  extra_dims: 0
  concept_loss_weight: 1
  learning_rate: 0.005
  weight_decay: 0.000004
  weight_loss: False
  c_extractor_arch: resnet34
  optimizer: sgd
  early_stopping_monitor: val_loss
  early_stopping_mode: min
  early_stopping_delta: 0.0
  momentum: 0.9
  sigmoidal_prob: False

runs:
  - architecture: 'SequentialConceptBottleneckModel'
    extra_name: "NoInterventionInTrainingNoHiddenConcepts"
    sigmoidal_embedding: False
    concat_prob: False
    embedding_activation: "leakyrelu"
    bool: False
    extra_dims: 0
    sigmoidal_extra_capacity: False
    sigmoidal_prob: True
    training_intervention_prob: 0
```
While doing this, I noticed a few unexpected things:
- Even with 100% interventions, the accuracy of the independent model is below 50%. Since all concepts are visible, and it is an independent model, the c2y component was trained directly on ground-truth concept vectors and would have seen all concept combinations that occur in the dataset. It should therefore perform well when all ground-truth concepts are provided during interventions.
| Method | Task Accuracy | Concept Accuracy | Concept AUC | 25% Int Acc | 50% Int Acc | 75% Int Acc | 100% Int Acc |
|---|---|---|---|---|---|---|---|
| IndependentConceptBottleneckModel | 0.2743 ± 0.0000 | 0.8237 ± 0.0000 | 0.8163 ± 0.0000 | 0.3009 ± 0.0000 | 0.3229 ± 0.0000 | 0.3457 ± 0.0000 | 0.3711 ± 0.0000 |
| SequentialConceptBottleneckModel | 0.2784 ± 0.0000 | 0.8237 ± 0.0000 | 0.8163 ± 0.0000 | 0.3107 ± 0.0000 | 0.3409 ± 0.0000 | 0.3584 ± 0.0000 | 0.3711 ± 0.0000 |
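For reference, my mental model of the 100% intervention case is roughly the following sketch (the names `c2y` and `full_intervention_accuracy` are hypothetical placeholders, not this repo's API): once every concept is intervened on, the label head sees exactly the ground-truth concept vector, so full-intervention task accuracy should match the accuracy of the c2y head evaluated directly on ground-truth concepts.

```python
import numpy as np

def full_intervention_accuracy(c2y, concepts_true, y_true):
    # 100% intervention: predicted concepts are replaced by ground truth,
    # so the label head is scored on the true concept vectors directly.
    logits = c2y(concepts_true)                      # (N, num_classes)
    return float((logits.argmax(axis=-1) == y_true).mean())

# Toy linear head as a stand-in for the trained c2y component.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 230))                        # 8 concepts -> 230 classes
c2y = lambda c: c @ W
concepts = rng.integers(0, 2, size=(100, 8))         # toy binary concept vectors
labels = c2y(concepts).argmax(axis=-1)               # labels consistent with the head
print(full_intervention_accuracy(c2y, concepts, labels))  # -> 1.0
```

In this toy setting, where the labels are perfectly predictable from the concepts, the full-intervention accuracy is 1.0, which is why the ~37% number above surprised me.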
- The training accuracy of the independent model is also pretty low, which is concerning. This is from epoch 49 (which appears to be the last epoch):
```
Epoch 49: 86%|████████▌ | 24/28 [00:01<00:00, 14.65it/s, loss=3.13, y_accuracy=0.381, y_top_3_accuracy=0.548, y_top_5_accuracy=0.619, y_top_10_accuracy=0.714, val_y_accuracy=0.384, val_y_top_3_accuracy=0.557, val_y_top_5_accuracy=0.629, val_y_top_10_accuracy=0.697]
```
- The number of classes in the dataset is 230 instead of 256 (= 2^8). I guess that is because some of the 256 possible concept combinations never appear in the dataset. Did you see this as well?
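To illustrate what I mean (toy data, not the actual CelebA annotations): with 8 binary concepts the class is determined by the concept bit-vector, so there are at most 2^8 = 256 classes, but only the combinations that actually occur get a class id.

```python
import numpy as np

# 8 binary concepts -> at most 2**8 = 256 distinct concept vectors.
assert 2 ** 8 == 256

# Toy 2-bit example: combination (1, 0) never occurs, so only 3 of the
# 4 possible classes show up in the data.
concepts = np.array([[0, 0], [0, 1], [0, 1], [1, 1]])
observed = len(np.unique(concepts, axis=0))
print(observed)  # -> 3
```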
Do you think these observations are expected? If so, it would be really helpful if you could share some intuition as to why.