Code for the paper "Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models", ISBI 2024.


Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models

Paper accepted at IEEE International Symposium on Biomedical Imaging - ISBI 2024.


Citation

If you use this repository, please cite:

@article{patricio2023towards,
  title={Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models},
  author={Patr{\'\i}cio, Cristiano and Teixeira, Lu{\'\i}s F and Neves, Jo{\~a}o C},
  journal={arXiv preprint arXiv:2311.14339},
  year={2023}
}

1. Download data

Note: To reproduce the results of the paper, mask the original images of each dataset with the available masks (download masks here).
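
As a rough illustration of the masking step described in the note, the sketch below zeroes out every pixel outside a binary lesion mask. It is a minimal example, not the authors' preprocessing code: the function name `apply_mask` is hypothetical, and it assumes that "masking out" means setting background pixels to black and that image and mask are already loaded as NumPy arrays of the same height and width.

```python
import numpy as np


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out every pixel of an H x W x C image that lies outside the mask.

    Hypothetical helper: assumes background pixels should become black.
    """
    if image.ndim != 3:
        raise ValueError("image must have shape (H, W, C)")
    if image.shape[:2] != mask.shape[:2]:
        raise ValueError("image and mask must share height and width")

    # Treat any non-zero mask value as "inside the lesion".
    keep = mask > 0
    if keep.ndim == 3:
        # Collapse a multi-channel mask to a single H x W plane.
        keep = keep.any(axis=-1)

    # Broadcast the H x W mask over the channel axis, preserving the dtype.
    return image * keep[..., None].astype(image.dtype)
```

The function returns a new array, so the original image is left untouched and can still be saved or inspected separately.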

2. Training

2.1 Prepare conda environment

Create a new conda environment with the required libraries listed in the requirements.txt file:

conda create --name cbi-vlm --file requirements.txt

2.2 Fine-Tune CLIP on Derm7pt and ISIC 2018

  • Use the configuration file (CLIP/modules/config.py) to adjust settings for training:

    • clip_model: choose between {ViT-B/32, ViT-B/16, RN50, RN101, ViT-L/14, RN50x16}
    • seed: choose between {0, 42, 84, 168}
    • dataset: choose between {'derm7pt', 'ISIC_2018'}
    • batch_size: default 32
    • image_embedding: set according to the embedding dimension of the chosen CLIP model
    • text_embedding: set according to the embedding dimension of the chosen CLIP model
    • projection_dim: set accordingly to your preference
    • path_to_model: path of the trained model

    See the supplementary document for more details on the chosen architectures.

  • Update the image file paths in the extract_image_embeddings function [CLIP/modules/utils.py] to match your own file paths.

  • Run train script [CLIP/train.py]:

python train.py

  • Run inference script [CLIP/inference.py] to extract the image and text embeddings used for evaluation:

python inference.py
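
The configuration options listed above depend on one another: the embedding sizes have to match the chosen CLIP backbone. The sketch below shows one way to keep them consistent. It is not the content of CLIP/modules/config.py; the names `CLIP_EMBED_DIM`, `TrainConfig` and `make_config` are hypothetical. The dimensions are the joint embedding sizes of the public OpenAI CLIP checkpoints, and the sketch assumes that `image_embedding` and `text_embedding` both refer to that joint size.

```python
from dataclasses import dataclass

# Joint image/text embedding size of each public OpenAI CLIP checkpoint.
CLIP_EMBED_DIM = {
    "ViT-B/32": 512,
    "ViT-B/16": 512,
    "RN50": 1024,
    "RN101": 512,
    "ViT-L/14": 768,
    "RN50x16": 768,
}
ALLOWED_SEEDS = (0, 42, 84, 168)
ALLOWED_DATASETS = ("derm7pt", "ISIC_2018")


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container mirroring the settings named in the README."""
    clip_model: str
    seed: int
    dataset: str
    batch_size: int
    image_embedding: int
    text_embedding: int
    projection_dim: int


def make_config(clip_model: str, seed: int, dataset: str,
                projection_dim: int, batch_size: int = 32) -> TrainConfig:
    """Validate the choices and fill in the embedding sizes automatically."""
    if clip_model not in CLIP_EMBED_DIM:
        raise ValueError(f"unknown clip_model: {clip_model!r}")
    if seed not in ALLOWED_SEEDS:
        raise ValueError(f"seed must be one of {ALLOWED_SEEDS}")
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"dataset must be one of {ALLOWED_DATASETS}")

    dim = CLIP_EMBED_DIM[clip_model]
    return TrainConfig(clip_model, seed, dataset, batch_size,
                       image_embedding=dim, text_embedding=dim,
                       projection_dim=projection_dim)
```

For example, `make_config("RN50", 42, "derm7pt", projection_dim=128)` yields a configuration with `image_embedding` and `text_embedding` both set to 1024; the value 128 is only a placeholder, since the README leaves `projection_dim` to your preference.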

3. Evaluation

All required dataset splits are available under the /data folder.

3.1. PH$^2$ dataset

  • $k$-fold evaluation:
# CLIP - Baseline
python CLIP/scr_k_fold_evaluate_PH2_Baseline.py

# CLIP - CBM
python CLIP/scr_k_fold_evaluate_PH2_CBM.py

# CLIP - GPT-CBM
python CLIP/scr_k_fold_evaluate_PH2_GPT-CBM.py

# MONET - Baseline
python MONET/scr_k_fold_evaluate_PH2_Baseline.py

# MONET - CBM
python MONET/scr_k_fold_evaluate_PH2_CBM.py

# MONET - GPT-CBM
python MONET/scr_k_fold_evaluate_PH2_GPT-CBM.py

# Each of the scripts above generates a NumPy file with the results. Load that file to analyze them.
  • Individual evaluation (jupyter notebooks):
# CLIP - Baseline
CLIP/scr_Baseline_CLIP_PH2.ipynb

# CLIP - CBM
CLIP/scr_CBM_CLIP_PH2.ipynb

# CLIP - GPT-CBM
CLIP/scr_GPT-CBM_CLIP_PH2.ipynb

# MONET - Baseline
MONET/scr_Baseline_MONET.ipynb

# MONET - CBM
MONET/scr_CBM_MONET.ipynb

# MONET - GPT-CBM
MONET/scr_GPT-CBM_MONET.ipynb
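
The k-fold scripts above each write their results to a NumPy file. The README does not specify the file name or the layout of the stored array, so the following is only a sketch under assumptions: the helper `summarize_results` is hypothetical, and it assumes the file holds a plain numeric array with one score per fold. If the scripts store something else (for instance a pickled dictionary), the loading step would need to be adapted.

```python
import numpy as np


def summarize_results(path: str) -> dict:
    """Load a .npy file of per-fold scores and report simple statistics.

    Assumption: the file contains a numeric array, one value per fold.
    """
    scores = np.asarray(np.load(path), dtype=float).ravel()
    return {
        "n": int(scores.size),
        "mean": float(scores.mean()),
        # Population standard deviation (ddof=0), NumPy's default.
        "std": float(scores.std()),
    }
```

Calling `summarize_results` on the file produced by one of the scripts returns the number of stored values together with their mean and standard deviation, which is usually enough for a first comparison between the Baseline, CBM and GPT-CBM variants.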

3.2. Derm7pt dataset

  • Evaluation over four runs:
# CLIP - Baseline
python CLIP/scr_evaluate_derm7pt_Baseline.py

# CLIP - CBM
python CLIP/scr_evaluate_derm7pt_CBM.py

# CLIP - GPT-CBM
python CLIP/scr_evaluate_derm7pt_GPT_CBM.py


# Each of the scripts above generates a NumPy file with the results. Load that file to analyze them.
  • Individual evaluation (jupyter notebooks):
# CLIP - Baseline
CLIP/scr_Baseline_CLIP-derm7pt.ipynb

# CLIP - CBM
CLIP/scr_CBM_CLIP-derm7pt.ipynb

# CLIP - GPT-CBM
CLIP/scr_GPT-CBM_CLIP-derm7pt.ipynb

# MONET - Baseline
MONET/scr_Baseline_MONET.ipynb

# MONET - CBM
MONET/scr_CBM_MONET.ipynb

# MONET - GPT-CBM
MONET/scr_GPT-CBM_MONET.ipynb

3.3. ISIC 2018 dataset

  • Evaluation over four runs:
# CLIP - Baseline
python CLIP/scr_evaluate_ISIC_2018_Baseline.py

# CLIP - CBM
python CLIP/scr_evaluate_ISIC_2018_CBM.py

# CLIP - GPT-CBM
python CLIP/scr_evaluate_ISIC_2018_GPT_CBM.py


# Each of the scripts above generates a NumPy file with the results. Load that file to analyze them.
  • Individual evaluation (jupyter notebooks):
# CLIP - Baseline
CLIP/scr_Baseline_CLIP-ISIC_2018.ipynb

# CLIP - CBM
CLIP/scr_CBM_CLIP-ISIC_2018.ipynb

# CLIP - GPT-CBM
CLIP/scr_GPT-CBM_CLIP-ISIC_2018.ipynb

# MONET - Baseline
MONET/scr_Baseline_MONET-ISIC_2018.ipynb

# MONET - CBM
MONET/scr_CBM_MONET.ipynb

# MONET - GPT-CBM
MONET/scr_GPT-CBM_MONET.ipynb

[Last update: Mon Feb 19 03:41:45 PM WET 2024]

concept-based-interpretability-vlm's Issues

Zero-shot inference

Greetings,
Great work!

Can you kindly explain how to use the weights that you provided for zero shot inference?
Thank you.
