
causal-unsupervised-segmentation's Introduction

This is the PyTorch implementation of CAusal Unsupervised Semantic sEgmentation (CAUSE), which improves the performance of unsupervised semantic segmentation. The code builds on two baseline codebases: HP (Leveraging Hidden Positives for Unsupervised Semantic Segmentation, accepted at CVPR 2023) and STEGO (Unsupervised Semantic Segmentation by Distilling Feature Correspondences, accepted at ICLR 2022).


The following images also appear in the Appendix of the paper. In addition, this README explains concrete implementation details beyond the description in the main paper.

Figure 1. Visual comparison of USS for COCO-Stuff. Note that, in contrast to the true labels, baseline frameworks fail to achieve the targeted level of granularity, while CAUSE successfully clusters person, sports, vehicle, etc.

Figure 2. Qualitative comparison of unsupervised semantic segmentation for Cityscapes.

Figure 3. Log scale of mIoU results for each category in COCO-Stuff (Black: Thing / Gray: Stuff)


🚀 Download Visual Quality, Seg Head Parameter, and Concept ClusterBook of CAUSE

You can download the checkpoint files containing CAUSE-trained parameters built on the DINO, DINOv2, iBOT, MSN, and MAE self-supervised vision transformers. The pretrained DINO models in the various structures that CAUSE uses are available at the following links:


Dataset     Method           Baseline  mIoU(%)  pAcc(%)  Visual Quality  Seg Head Parameter  Concept ClusterBook
COCO-Stuff  DINO+CAUSE-MLP   ViT-S/8   27.9     66.8     [link]          [link]              [link]
COCO-Stuff  DINO+CAUSE-TR    ViT-S/8   32.4     69.6     [link]          [link]              [link]
COCO-Stuff  DINO+CAUSE-MLP   ViT-S/16  25.9     66.3     [link]          [link]              [link]
COCO-Stuff  DINO+CAUSE-TR    ViT-S/16  33.1     70.4     [link]          [link]              [link]
COCO-Stuff  DINO+CAUSE-MLP   ViT-B/8   34.3     72.8     [link]          [link]              [link]
COCO-Stuff  DINO+CAUSE-TR    ViT-B/8   41.9     74.9     [link]          [link]              [link]
COCO-Stuff  DINOv2+CAUSE-TR  ViT-B/14  45.3     78.0     [link]          [link]              [link]
COCO-Stuff  iBOT+CAUSE-TR    ViT-B/16  39.5     73.8     [link]          [link]              [link]
COCO-Stuff  MSN+CAUSE-TR     ViT-S/16  34.1     72.1     [link]          [link]              [link]
COCO-Stuff  MAE+CAUSE-TR     ViT-B/16  21.5     59.1     [link]          [link]              [link]

Dataset     Method           Baseline  mIoU(%)  pAcc(%)  Visual Quality  Seg Head Parameter  Concept ClusterBook
Cityscapes  DINO+CAUSE-MLP   ViT-S/8   21.7     87.7     [link]          [link]              [link]
Cityscapes  DINO+CAUSE-TR    ViT-S/8   24.6     89.4     [link]          [link]              [link]
Cityscapes  DINO+CAUSE-MLP   ViT-B/8   25.7     90.3     [link]          [link]              [link]
Cityscapes  DINO+CAUSE-TR    ViT-B/8   28.0     90.8     [link]          [link]              [link]
Cityscapes  DINOv2+CAUSE-TR  ViT-B/14  29.9     89.8     [link]          [link]              [link]
Cityscapes  iBOT+CAUSE-TR    ViT-B/16  23.0     89.1     [link]          [link]              [link]
Cityscapes  MSN+CAUSE-TR     ViT-S/16  21.2     89.1     [link]          [link]              [link]
Cityscapes  MAE+CAUSE-TR     ViT-B/16  12.5     82.0     [link]          [link]              [link]

Dataset     Method           Baseline  mIoU(%)  pAcc(%)  Visual Quality  Seg Head Parameter  Concept ClusterBook
Pascal VOC  DINO+CAUSE-MLP   ViT-S/8   46.0     -        [link]          [link]              [link]
Pascal VOC  DINO+CAUSE-TR    ViT-S/8   50.0     -        [link]          [link]              [link]
Pascal VOC  DINO+CAUSE-MLP   ViT-B/8   47.9     -        [link]          [link]              [link]
Pascal VOC  DINO+CAUSE-TR    ViT-B/8   53.3     -        [link]          [link]              [link]
Pascal VOC  DINOv2+CAUSE-TR  ViT-B/14  53.2     91.5     [link]          [link]              [link]
Pascal VOC  iBOT+CAUSE-TR    ViT-B/16  53.4     89.6     [link]          [link]              [link]
Pascal VOC  MSN+CAUSE-TR     ViT-S/16  30.2     84.2     [link]          [link]              [link]
Pascal VOC  MAE+CAUSE-TR     ViT-B/16  25.8     83.7     [link]          [link]              [link]

Dataset     Method           Baseline  mIoU(%)  pAcc(%)  Visual Quality  Seg Head Parameter  Concept ClusterBook
COCO-81     DINO+CAUSE-MLP   ViT-S/8   19.1     78.8     [link]          [link]              [link]
COCO-81     DINO+CAUSE-TR    ViT-S/8   21.2     75.2     [link]          [link]              [link]
COCO-171    DINO+CAUSE-MLP   ViT-S/8   10.6     44.9     [link]          [link]              [link]
COCO-171    DINO+CAUSE-TR    ViT-S/8   15.2     46.6     [link]          [link]              [link]
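
The mIoU(%) and pAcc(%) columns in the tables above are the usual segmentation metrics. As a rough illustration only, the sketch below computes both from flat lists of predicted cluster ids and ground-truth class ids. It assumes that clusters are first matched one-to-one to classes, as is common in unsupervised segmentation evaluation; the brute-force search over permutations is a stand-in for Hungarian matching and is only practical for a handful of classes. All function names are hypothetical and are not taken from the repository.

```python
from itertools import permutations


def confusion_matrix(pred, target, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted clusters."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def best_matching(cm):
    """Return perm where perm[t] is the cluster matched to class t.

    Brute-force stand-in for Hungarian matching (assumption for illustration).
    """
    n = len(cm)
    return max(
        permutations(range(n)),
        key=lambda perm: sum(cm[t][perm[t]] for t in range(n)),
    )


def miou_and_pacc(cm, perm):
    """Return (mIoU, pixel accuracy) in percent under the given matching."""
    n = len(cm)
    total = sum(map(sum, cm))
    correct = sum(cm[t][perm[t]] for t in range(n))
    ious = []
    for t in range(n):
        tp = cm[t][perm[t]]
        gt = sum(cm[t])                                # pixels of class t
        predicted = sum(cm[r][perm[t]] for r in range(n))  # pixels in matched cluster
        union = gt + predicted - tp
        if union > 0:                                  # skip classes absent everywhere
            ious.append(tp / union)
    miou = 100 * sum(ious) / len(ious) if ious else 0.0
    return miou, 100 * correct / total


target = [0, 0, 0, 1, 1, 1]
pred = [1, 1, 0, 0, 0, 0]   # cluster ids are arbitrary before matching
cm = confusion_matrix(pred, target, num_classes=2)
miou, pacc = miou_and_pacc(cm, best_matching(cm))
```

For real label counts such as the 27 classes of COCO-Stuff, the permutation search would have to be replaced by a proper assignment solver.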

🤖 CAUSE Framework (Top-Level File Directory Layout)

.
├── loader
│   ├── netloader.py                # Self-Supervised Pretrained Model Loader & Segmentation Head Loader
│   └── dataloader.py               # Dataloader Thanks to STEGO [ICLR 2022]
│
├── models                          # Model Design of Self-Supervised Pretrained: [DINO/DINOv2/iBOT/MAE/MSN]
│   ├── dinomaevit.py               # ViT Structure of DINO and MAE
│   ├── dinov2vit.py                # ViT Structure of DINOv2
│   ├── ibotvit.py                  # ViT Structure of iBOT
│   └── msnvit.py                   # ViT Structure of MSN
│
├── modules                         # Segmentation Head and Its Necessary Functions
│   ├── segment_module.py           # Tools for Generating Concept Book and Contrastive Learning
│   └── segment.py                  # [MLP & TR] Including Tools for Generating Concept Book and Contrastive Learning
│
├── utils
│   └── utils.py                    # Utility for auxiliary tools
│
├── train_mediator.py               # (STEP 1) [MLP & TR] Generating Concept Cluster Book as a Mediator
│
├── train_front_door_mlp.py         # (STEP 2) [MLP] Frontdoor Adjustment through Unsupervised Semantic Segmentation
├── fine_tuning_mlp.py              # (STEP 3) [MLP] Fine-Tuning Cluster Probe
│
├── train_front_door_tr.py          # (STEP 2) [TR] Frontdoor Adjustment through Unsupervised Semantic Segmentation
├── fine_tuning_tr.py               # (STEP 3) [TR] Fine-Tuning Cluster Probe
│
├── test_mlp.py                     # [MLP] Evaluating Unsupervised Semantic Segmentation Performance (Post-Processing)
├── test_tr.py                      # [TR] Evaluating Unsupervised Semantic Segmentation Performance (Post-Processing)
│
├── requirements.txt
└── README.md

📊 How to Run CAUSE?

First, generate the cropped datasets by following STEGO (ICLR 2022).

python crop_dataset.py --dataset "cocostuff27" --crop_type "five"
python crop_dataset.py --dataset "cityscapes"  --crop_type "five"
python crop_dataset.py --dataset "pascalvoc"   --crop_type "super"
python crop_dataset.py --dataset "coco81"      --crop_type "double"
python crop_dataset.py --dataset "coco171"     --crop_type "double"
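
To give a feel for what a crop type might do, here is a minimal sketch of a five-crop scheme. It assumes that "five" means the four corner crops plus the center crop, in the spirit of torchvision's FiveCrop transform. The crop_ratio value of 0.5 and the function name are hypothetical, and the actual behavior of crop_dataset.py (including the "double" and "super" types) is not shown in this README.

```python
def five_crop_boxes(width, height, crop_ratio=0.5):
    """Return five (left, top, right, bottom) boxes: four corners, then center.

    crop_ratio is the fraction of each side kept per crop (assumed value).
    """
    cw, ch = int(width * crop_ratio), int(height * crop_ratio)
    cx, cy = (width - cw) // 2, (height - ch) // 2
    return [
        (0, 0, cw, ch),                            # top-left
        (width - cw, 0, width, ch),                # top-right
        (0, height - ch, cw, height),              # bottom-left
        (width - cw, height - ch, width, height),  # bottom-right
        (cx, cy, cx + cw, cy + ch),                # center
    ]


boxes = five_crop_boxes(640, 480)
```

Each box can be passed to an image library's crop call, for example Pillow's Image.crop, to produce the cropped training images.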

Then run the integrated script:

bash run # All of the following three steps integrated

This shell script contains the following code:

#!/bin/bash
######################################
# [OPTION] DATASET
# cocostuff27
dataset="cocostuff27"
######################################

######################################
# [OPTION] STRUCTURE
structure="TR"
######################################

######################################
# [OPTION] Self-Supervised Method
ckpt="checkpoint/dino_vit_base_8.pth"
######################################

######################################
# GPU and PORT
if [ "$structure" = "MLP" ]
then
    train_gpu="0,1,2,3"
elif [ "$structure" = "TR" ]
then
    train_gpu="4,5,6,7"
fi

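# Non-Changeable Variable
# Note: ${train_gpu:0} is a substring from offset 0, so it expands to the full GPU list
test_gpu="${train_gpu:0}"
# Random port in the range 1200-1999
port=$(($RANDOM%800+1200))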
######################################

######################################
# [STEP1] MEDIATOR
python train_mediator.py --dataset $dataset --ckpt $ckpt --gpu $train_gpu --port $port
######################################

######################################
# [STEP2] CAUSE
if [ "$structure" = "MLP" ]
then 
    python train_front_door_mlp.py --dataset $dataset --ckpt $ckpt --gpu $train_gpu --port $port
    python fine_tuning_mlp.py --dataset $dataset --ckpt $ckpt --gpu $train_gpu --port $port
elif [ "$structure" = "TR" ]
then
    python train_front_door_tr.py --dataset $dataset --ckpt $ckpt --gpu $train_gpu --port $port 
    python fine_tuning_tr.py --dataset $dataset --ckpt $ckpt --gpu $train_gpu --port $port
fi
######################################

######################################
# TEST
if [ "$structure" = "MLP" ]
then 
    python test_mlp.py --dataset $dataset --ckpt $ckpt --gpu $test_gpu
elif [ "$structure" = "TR" ]
then 
    python test_tr.py --dataset $dataset --ckpt $ckpt --gpu $test_gpu
fi
######################################
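
For readers who prefer to drive the same pipeline from Python, the following sketch mirrors the script above by building the four command lines (mediator, frontdoor training, fine-tuning, test) for a chosen structure. It is an illustrative equivalent under the assumption that the scripts accept exactly the flags shown in the shell script; build_pipeline is a hypothetical helper and not part of the repository.

```python
import random


def build_pipeline(dataset, structure, ckpt, train_gpu, port=None):
    """Return the list of commands the shell script would run, in order."""
    if structure not in ("MLP", "TR"):
        raise ValueError("structure must be 'MLP' or 'TR'")
    suffix = structure.lower()
    if port is None:
        # Same range as $RANDOM%800+1200 in the shell script: 1200..1999
        port = random.randrange(800) + 1200
    # The shell script reuses the training GPU list for testing
    common = ["--dataset", dataset, "--ckpt", ckpt, "--gpu", train_gpu]
    train = common + ["--port", str(port)]
    return [
        ["python", "train_mediator.py"] + train,             # STEP 1
        ["python", f"train_front_door_{suffix}.py"] + train,  # STEP 2
        ["python", f"fine_tuning_{suffix}.py"] + train,       # STEP 3
        ["python", f"test_{suffix}.py"] + common,             # TEST (no --port)
    ]


commands = build_pipeline(
    "cocostuff27", "TR", "checkpoint/dino_vit_base_8.pth", "4,5,6,7", port=1234
)
```

Each entry can then be executed with subprocess.run(cmd, check=True) so that a failing step stops the pipeline.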

1. Training CAUSE

(STEP 1): Generating Mediator based on Modularity

python train_mediator.py # DINO/DINOv2/iBOT/MSN/MAE
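
As background for the term "modularity" in STEP 1, the sketch below evaluates the classical Newman modularity of a hard partition on a small undirected graph. This is only the textbook graph definition, shown for intuition; it is not the training objective implemented in train_mediator.py, which this README does not describe. The function name and the toy graph are illustrative assumptions.

```python
def modularity(adj, communities):
    """Newman modularity Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) * [c_i == c_j]."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the number of edges for an undirected graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

q_good = modularity(adj, [0, 0, 0, 1, 1, 1])  # one community per triangle
q_single = modularity(adj, [0] * 6)           # everything in one community
```

Splitting along the two triangles yields a higher score than lumping all nodes together, which is the intuition behind using modularity to discover clusters.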

(STEP 2): Frontdoor Adjustment through Contrastive Learning

python train_front_door_mlp.py # CAUSE-MLP

# or

python train_front_door_tr.py # CAUSE-TR
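
STEP 2 is described as contrastive learning. For orientation, here is a minimal sketch of a generic InfoNCE-style contrastive loss for one anchor vector, written in plain Python. It is a standard formulation chosen for illustration, not the loss used in train_front_door_mlp.py or train_front_door_tr.py; the function name and the temperature value are assumptions.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax of the positive's cosine similarity."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    # Index 0 holds the positive pair; the rest are negatives
    scaled = [cosine(anchor, positive) / temperature]
    scaled += [cosine(anchor, n) / temperature for n in negatives]
    peak = max(scaled)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return log_denominator - scaled[0]


loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

The loss is small when the anchor is closer to its positive than to any negative, and grows when a negative is the closest vector.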

(STEP 3): Technical STEP: Fine-Tuning Cluster Probe

python fine_tuning_mlp.py # CAUSE-MLP

# or

python fine_tuning_tr.py # CAUSE-TR
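
To illustrate what a cluster probe does at inference time, the sketch below assigns each feature vector to its most similar centroid by cosine similarity. This assumes a STEGO-style probe in which clusters are represented by centroid vectors; how fine_tuning_mlp.py and fine_tuning_tr.py actually train the probe is not covered here, and assign_clusters is a hypothetical name.

```python
import math


def assign_clusters(features, centroids):
    """Return, for each feature, the index of the centroid with highest cosine similarity."""

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    unit_centroids = [normalize(c) for c in centroids]
    labels = []
    for feature in features:
        unit_feature = normalize(feature)
        sims = [sum(a * b for a, b in zip(unit_feature, c)) for c in unit_centroids]
        labels.append(max(range(len(sims)), key=sims.__getitem__))
    return labels


labels = assign_clusters(
    features=[[2.0, 0.1], [0.2, 3.0], [5.0, 1.0]],
    centroids=[[1.0, 0.0], [0.0, 1.0]],
)
```

The resulting cluster ids carry no semantic meaning on their own, which is why evaluation matches them to ground-truth classes before computing metrics.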

2. Testing CAUSE

python test_mlp.py # CAUSE-MLP

# or

python test_tr.py # CAUSE-TR

💡 Environment Settings

  • Creating Virtual Environment by Anaconda

conda create -y -n neurips python=3.9

  • Installing PyTorch Package in Virtual Environment

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

  • Installing Pip Package

pip install -r requirements.txt

  • [Optional] Removing Conda and PIP Cache if Conda and PIP have been locked for unknown reasons

conda clean -a && pip cache purge
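
After installation, a quick sanity check can confirm that the expected packages are importable in the active environment. The sketch below only tests whether each module can be found and does not import it or verify CUDA; the helper name check_environment is hypothetical, and the default package list simply mirrors the pip3 install line above.

```python
import importlib.util
import sys


def check_environment(required=("torch", "torchvision", "torchaudio")):
    """Report the Python version and whether each required module can be found."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in required:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If a package is reported as False, re-run the corresponding install command inside the activated environment.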


๐Ÿ… Download Datasets

Available Datasets

Note: Pascal VOC does not need to be downloaded manually, because the dataloader automatically downloads it to your dataset path.

Try the following scripts

If the above does not work, download azcopy and use the scripts below

Unzip Datasets

unzip cocostuff.zip && unzip cityscapes.zip

causal-unsupervised-segmentation's People

Contributors

byungkwanlee
