Code Monkey home page Code Monkey logo

ov-seg-mess's Introduction

Multi-domain Evaluation of Semantic Segmentation (MESS) with OVSeg

[Website (soon)] [arXiv (soon)] [GitHub]

This directory contains the code for the MESS evaluation of OVSeg. Please see the commits for our changes of the model.

Setup

Create a conda environment ovseg and install the required packages. See mess/README.md for details.

 bash mess/setup_env.sh

Prepare the datasets by following the instructions in mess/DATASETS.md. The ovseg env can be used for the dataset preparation. If you evaluate multiple models with MESS, you can change the dataset_dir argument and the DETECTRON2_DATASETS environment variable to a common directory (see mess/DATASETS.md and mess/eval.sh, e.g., ../mess_datasets).

Download the OVSeg weights (see https://github.com/facebookresearch/ov-seg/blob/main/GETTING_STARTED.md)

mkdir weights
conda activate ovseg
# Python code for downloading the weights from GDrive. Link: https://drive.google.com/file/d/1cn-ohxgXDrDfkzC1QdO-fi8IjbjXmgKy/view
python -c "import gdown; gdown.download(f'https://drive.google.com/uc?export=download&confirm=pbef&id=1cn-ohxgXDrDfkzC1QdO-fi8IjbjXmgKy', output='weights/ovseg_swinbase_vitL14_ft_mpt.pth')"

Evaluation

To evaluate the OVSeg model on the MESS dataset, run

bash mess/eval.sh

# for evaluation in the background:
nohup bash mess/eval.sh > eval.log &
tail -f eval.log 

For evaluating a single dataset, select the DATASET from mess/DATASETS.md, the DETECTRON2_DATASETS path, and run

conda activate ovseg
export DETECTRON2_DATASETS="datasets"
DATASET=<dataset_name>

# OVSeg large model
python train_net.py --num-gpus 1 --eval-only --config-file configs/ovseg_swinB_vitL_bs32_120k.yaml MODEL.WEIGHTS weights/ovseg_swinbase_vitL14_ft_mpt.pth OUTPUT_DIR output/OVSeg/$DATASET DATASETS.TEST \(\"$DATASET\",\)

--- Original OVSeg README.md ---

[OVSeg] Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

This is the official PyTorch implementation of our paper:
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
Computer Vision and Pattern Recognition Conference (CVPR), 2023

[arXiv] [Project] [huggingface demo]

Installation

Please see installation guide.

Data Preparation

Please see datasets preparation.

Getting started

Please see getting started instruction.

Finetuning CLIP

Please see open clip training.

LICENSE

Shield: CC BY-NC 4.0

The majority of OVSeg is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

CC BY-NC 4.0

However portions of the project are under separate license terms: CLIP and ZSSEG are licensed under the MIT license; MaskFormer is licensed under the CC-BY-NC; openclip is licensed under the license at its repo.

Citing OVSeg ๐Ÿ™

If you use OVSeg in your research or wish to refer to the baseline results published in the paper, please use the following BibTeX entry.

@inproceedings{liang2023open,
  title={Open-vocabulary semantic segmentation with mask-adapted clip},
  author={Liang, Feng and Wu, Bichen and Dai, Xiaoliang and Li, Kunpeng and Zhao, Yinan and Zhang, Hang and Zhang, Peizhao and Vajda, Peter and Marculescu, Diana},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7061--7070},
  year={2023}
}

ov-seg-mess's People

Contributors

jeff-liangf avatar blumenstiel avatar bichenwu09 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.