marcelrobeer / contrastiveexplanation

Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University

License: BSD 3-Clause "New" or "Revised" License

Python 73.63% Jupyter Notebook 26.37%

interpretability machine-learning contrastive-explanations foil model-agnostic explainable-ai

contrastiveexplanation's Introduction

Welcome! 🇳🇱

🐻 My name is Marcel Robeer, and I am currently pursuing a PhD in Explainable Artificial Intelligence (XAI) at Utrecht University!

🤖 My thesis projects and scientific research projects have resulted in several open-source Python packages:

Explabox: {Explore | Examine | Explain | Expose } your AI model with the explabox!
GlobalCausalAnalysis: Explaining Model Behavior with Global Causal Analysis (give a causal overview of how aspects such as task-related features, fairness and robustness relate to black-box model behavior) [xAI 2023 paper].
text_explainability: A generic explainability architecture for explaining text machine learning models.
text_sensitivity: Extension of text_explainability for sensitivity testing (robustness & fairness).
CounterfactualGAN: Generating realistic natural language counterfactuals for classifiers and regressors, without requiring explainee intervention [EMNLP 2021 paper].
ContrastiveExplanation: Contrastive and counterfactual explanations for machine learning with Foil Trees [WHI 2018 paper].
VisualNarrator: Turns user stories into a conceptual model containing entities and relationships [RE 2016 paper].

💻 Check out marcelrobeer.github.io for a full overview. See you there!

contrastiveexplanation's People

fgsilva calebcc stjordanis dobryk15 m4laclypse thaole25

contrastiveexplanation's Issues

lightgbm categorical feature support

Does the model support categorical feature types for lgbm?
I got an error when running it with categorical features specified.

I wanted to ask whether you have considered creating a DOI for Contrastive Explanation. A DOI lets researchers reference a specific version of the package with ease, and can be created with Zenodo, for example. Zenodo integrates well with GitHub: it creates a new DOI and a persistent copy of the repository for each release. The official GitHub docs have instructions on how to create a DOI: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content
This step improves the visibility of this repository, which in turn helps your research software get used more widely.

On a similar note, it also helps researchers to know how to cite the software correctly. I see that you already added this to the README, but GitHub also supports the Citation File Format, which shows up at the top right of the repository page when used.

You can find more information about it here: https://citation-file-format.github.io/

I would be happy to help with this if you have any questions.
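
As a rough illustration of the citation file mentioned above, the sketch below builds the text of a minimal CITATION.cff using only the Python standard library. The keys (cff-version, message, title, authors, license) follow the Citation File Format 1.2.0 schema. The values are taken from this page and are assumptions, not the maintainer's official metadata, and the helper name build_citation_cff is hypothetical.

```python
def build_citation_cff(title, family_name, given_name, license_id):
    """Return the text of a minimal CITATION.cff file (CFF schema 1.2.0).

    Hypothetical helper for illustration; not part of any package.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
        f'  - family-names: "{family_name}"',
        f'    given-names: "{given_name}"',
        f"license: {license_id}",
    ]
    return "\n".join(lines) + "\n"


# Illustrative values taken from this page; adjust before real use.
cff_text = build_citation_cff(
    "ContrastiveExplanation", "Robeer", "Marcel", "BSD-3-Clause"
)

if __name__ == "__main__":
    # Save this text as CITATION.cff in the repository root.
    print(cff_text)
```

GitHub reads the file from the repository root, so the printed text would be saved there as CITATION.cff.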

Explanations of Clustering algorithms

Hi, I was wondering whether clustering algorithms are supported. I see you mention them in the README, but looking at the code, I can't find them anywhere.
Thanks :)

Changing output every run

Hi,

When I run exp.explain_instance_domain(model.predict_proba, sample), I get different output every time.
After every run, a different column is given as output.

Is there a way I can get consistent results?
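
Explanation methods that sample a neighborhood around the instance tend to give different output on each run unless the random generator is seeded. The sketch below is generic and does not use the package's API; whether ContrastiveExplanation exposes its own seed argument is not confirmed here. The helper sample_neighborhood and the feature values are hypothetical.

```python
import random


def sample_neighborhood(instance, n_samples=5, scale=0.1, seed=None):
    """Draw Gaussian perturbations around `instance` (hypothetical helper).

    With seed=None the generator is seeded from system entropy, so every
    call differs; a fixed integer seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    return [
        [value + rng.gauss(0.0, scale) for value in instance]
        for _ in range(n_samples)
    ]


sample = [5.1, 3.5, 1.4, 0.2]  # illustrative feature values

first = sample_neighborhood(sample, seed=1)
second = sample_neighborhood(sample, seed=1)
print(first == second)  # compares two draws made with the same seed
```

If the neighborhood is identical across runs, any model fitted on it with a fixed random state would also be identical, which is the usual route to repeatable explanations.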

Trying to understand output

I have been trying your approach for a regression problem with categorical features. I receive an explanation in the form:

The model predicted 123 because sales < 1445 and not month (dummy example)

month is a categorical variable with values "1", "2", .. "12".

What does "and not month" mean, then?

Thank you

Always getting the warning "UserWarning: Could not find a difference between fact...", with blank explanations - for any dataset, and every sample.

I am trying to exactly recreate the example from the README for the Iris dataset. Unfortunately, when running .explain_instance_domain(model.predict_proba, sample) I get the following output:

[F] Picked foil "1" using foil selection strategy "second"
[D] Obtaining neighborhood data
C:\Users\dsemkoandrosenko\contrastive_explanation\contrastive_explanation.py:264: UserWarning: Could not find a difference between fact "setosa" and foil "versicolor"
warnings.warn(f'Could not find a difference between fact '
"The model predicted 'setosa' instead of 'versicolor' because ''"

I get the same issue with every single other sample, and even every other dataset I try. What could be the issue?

Versions:
Windows 10
Python: 3.7.4
Scikit-Learn: 0.21.3
Numpy: 1.18.2

Specify Features

Hi, thank you for the great package. Is there a way to specify which features are allowed to change? For example, I would like to see what I need to change when only specific features can vary.

Thank you

ValueError: blocks must be 2-D

Running the example notebook (1.3a block) I get the following error:

ValueError: blocks must be 2-D

Any idea how to fix it?

Not working for multi-valued categorical features

Does the current implementation support only binary-valued categorical features?

I tried it with the Adult income dataset, which has many multi-valued categorical and continuous features (https://archive.ics.uci.edu/ml/datasets/adult),
and got output like this:

"The model predicted '<=50k' instead of '>50k' because 'hours_per_week <= 42.832 and not occupation and age <= 34.95 and not education and hours_per_week <= 57.892'"

Here, education and occupation are not binary features - they have many levels.
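
One possible reading of rules such as "not occupation" (an assumption, not confirmed by the source) is that the feature is treated as a single binary column, so the level information is lost in the rule text. A common workaround with tree-based explainers is to one-hot encode multi-valued categorical features before fitting, so that each rule names a specific level. The sketch below is plain Python rather than the package's API; one_hot and the example rows are hypothetical.

```python
def one_hot(rows, column, levels):
    """Replace `column` in each row dict with one 0/1 column per level."""
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Illustrative rows; the level names are examples only.
rows = [
    {"age": 34, "occupation": "Sales"},
    {"age": 51, "occupation": "Tech-support"},
]
levels = ["Sales", "Tech-support"]

encoded = one_hot(rows, "occupation", levels)
print(encoded[0])
```

With this encoding, a rule such as occupation_Sales <= 0.5 can be read as "occupation is not Sales", which is easier to interpret than a bare "not occupation".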

TensorFlow or PyTorch Support

This is a great package. Do you plan to support TensorFlow or PyTorch in the future?

TypeError in the example notebook

Running the example notebook (Contrastive explanation - example usage), the line exp.explain_instance_domain(model.predict_proba, sample)
gives the following error: