BoxSnake: Polygonal Instance Segmentation with Box Supervision

Rui Yang, Lin Song, Yixiao Ge, Xiu Li

BoxSnake is an end-to-end training technique to achieve effective polygonal instance segmentation using only box annotations. It consists of two loss functions: (1) a point-based unary loss that constrains the bounding box of predicted polygons to achieve coarse-grained segmentation; and (2) a distance-aware pairwise loss that encourages the predicted polygons to fit the object boundaries.
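To make the first loss more concrete, here is a minimal sketch of a box-level unary term. It takes the tightest axis-aligned box around the predicted polygon vertices and penalizes its mismatch with the annotated box. The penalty `1 - IoU` and the helper names `polygon_to_box`, `box_iou`, and `unary_box_loss` are assumptions chosen for illustration. They are not taken from the BoxSnake code, and the exact loss in the paper may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def polygon_to_box(vertices: List[Point]) -> Box:
    """Tightest axis-aligned box around the polygon vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unary_box_loss(vertices: List[Point], gt_box: Box) -> float:
    """Hypothetical stand-in for the unary term: 1 - IoU(box(polygon), gt_box)."""
    return 1.0 - box_iou(polygon_to_box(vertices), gt_box)


polygon = [(2.0, 1.0), (6.0, 2.0), (5.0, 5.0), (1.0, 4.0)]
# The polygon's box equals the annotation, so the loss vanishes.
loss_tight = unary_box_loss(polygon, (1.0, 1.0, 6.0, 5.0))
# The polygon is much smaller than the annotation, so the loss is large.
loss_loose = unary_box_loss(polygon, (0.0, 0.0, 10.0, 10.0))
```

A term of this kind only sees the extreme vertices of the polygon, which is why it yields coarse-grained segmentation. The pairwise loss is what pulls the remaining vertices toward object boundaries.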

Arxiv Paper | [Video Demo]

Installation


BoxSnake depends on Detectron2 and torch 1.9.0+. Install the dependencies with:

pip install -r requirements.txt
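After installation it can be useful to confirm that the installed torch satisfies the 1.9.0+ requirement. The sketch below compares version strings numerically, so that `1.10.1` is correctly treated as newer than `1.9.0`. The helper names `parse_release` and `meets_minimum` are hypothetical and not part of BoxSnake, and the parser only handles the leading `major.minor.patch` numbers.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '1.9.0+cu111'."""
    public = version.split("+", 1)[0]  # drop a local build suffix like '+cu111'
    parts = []
    for piece in public.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:  # pad short versions such as '2.1'
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def meets_minimum(installed: str, minimum: str = "1.9.0") -> bool:
    """True if the installed version is at least the minimum version."""
    return parse_release(installed) >= parse_release(minimum)
```

In a real environment the installed version can be read from `torch.__version__` and passed to `meets_minimum`.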

BoxSnake also uses the deformable attention modules introduced in Deformable-DETR and the differentiable rasterizer introduced in BoundaryFormer. Please build them on your system:

bash scripts/auto_build.sh

or build them manually:

cd ./modeling/layers/deform_attn
sh ./make.sh
# deform_attn and diff_ras are sibling directories under modeling/layers
cd ../diff_ras
python setup.py build install

Model Zoo


COCO

Arch  Backbone    lr sched  mask AP  config  Download
RCNN  R50-FPN     1X        31.1     config  weights
RCNN  R50-FPN     2X        31.6     config  weights
RCNN  R101-FPN    1X        31.6     config  weights
RCNN  R101-FPN    2X        32.1     config  weights
RCNN  Swin-B-FPN  1X        38.3     config  weights
RCNN  Swin-L-FPN  1X        38.9     config  weights

mask AP is reported on the validation set.

Cityscapes

Arch  Backbone  lr sched  mask AP  config  Download
RCNN  R50-FPN   24K iter  26.3     config  weights

Getting Started


We use the COCO and Cityscapes datasets. Please follow the instructions here to prepare them.

If you would like to use a Swin Transformer backbone, please download the Swin weights from here and convert them to pkl format:

python tools/convert-pretrained-model-to-d2.py ${your_swin_pretrained.pth} ${your_swin_pretrained.pkl}

Training

To train on COCO dataset using the R50 backbone at a 1X schedule:

# 8 gpus
python train_net.py --num-gpus 8 --config-file configs/COCO-InstanceSegmentation/BoxSnake_RCNN/boxsnake_rcnn_R_50_FPN_1x.yaml

Alternatively, run the helper script with your own config:

bash scripts/auto_run.sh $CONFIG  # your config
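The training command above assumes 8 GPUs. When fewer are available, a common heuristic is the linear scaling rule: shrink the batch size and learning rate by the same factor and lengthen the schedule accordingly. The sketch below builds the matching Detectron2-style overrides (`SOLVER.IMS_PER_BATCH`, `SOLVER.BASE_LR`, `SOLVER.MAX_ITER`), which can be appended after the `--config-file` argument in the same way `MODEL.WEIGHTS` is passed for evaluation. The function `scaled_solver_opts` is hypothetical, and the reference values in the example (batch size 16, learning rate 0.02, 90000 iterations) are placeholders rather than values read from the BoxSnake configs, so check the chosen YAML file for the real ones.

```python
from typing import List


def scaled_solver_opts(
    num_gpus: int,
    ref_gpus: int,
    ref_batch: int,
    ref_lr: float,
    ref_iters: int,
) -> List[str]:
    """Return config overrides that follow the linear scaling rule."""
    if num_gpus <= 0 or ref_gpus <= 0:
        raise ValueError("GPU counts must be positive")
    # Scale the batch size with the GPU count, keeping at least one image.
    batch = max(1, round(ref_batch * num_gpus / ref_gpus))
    # Use the ratio of the rounded batch so lr and iterations stay consistent.
    scale = batch / ref_batch
    lr = ref_lr * scale
    iters = round(ref_iters / scale)
    return [
        "SOLVER.IMS_PER_BATCH", str(batch),
        "SOLVER.BASE_LR", f"{lr:g}",
        "SOLVER.MAX_ITER", str(iters),
    ]


# Placeholder reference schedule: 8 GPUs, batch 16, lr 0.02, 90000 iterations.
example_opts = scaled_solver_opts(
    num_gpus=2, ref_gpus=8, ref_batch=16, ref_lr=0.02, ref_iters=90000
)
command_suffix = " ".join(example_opts)
```

If the config also defines learning-rate decay milestones in `SOLVER.STEPS`, those would need to be stretched by the same factor. Whether linear scaling suits the optimizer used by BoxSnake is not stated here, so treat the result as a starting point.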

Inference

To run inference on the COCO validation set using trained weights:

# 8 gpus
python train_net.py --num-gpus 8 --config-file configs/COCO-InstanceSegmentation/BoxSnake_RCNN/boxsnake_rcnn_R_50_FPN_1x.yaml \
 --eval-only MODEL.WEIGHTS ${your/checkpoints/boxsnake_rcnn_R_50_FPN_coco_1x.pth}

To run inference on a single image using trained weights:

python demo/demo.py --config-file configs/COCO-InstanceSegmentation/BoxSnake_RCNN/boxsnake_rcnn_R_50_FPN_1x.yaml --input demo/demo.jpg --output ${/your/visualized/dir} --confidence-threshold 0.5 --opts MODEL.WEIGHTS ${your/checkpoints/boxsnake_rcnn_R_50_FPN_coco_1x.pth}

Others


BoxSnake is inspired by traditional level-set methods (including boxlevelset) and GVF (gradient vector flow) methods. The following links introduce them:

Some background in geometry may also help readers understand BoxSnake better:

Acknowledgement

If you find BoxSnake helpful, please cite:

@misc{BoxSnake,
      title={BoxSnake: Polygonal Instance Segmentation with Box Supervision}, 
      author={Rui Yang and Lin Song and Yixiao Ge and Xiu Li},
      year={2023},
      eprint={2303.11630},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
