
rcrnet-pytorch's Introduction

RCRNet-Pytorch

This repository contains the PyTorch implementation for

Semi-Supervised Video Salient Object Detection Using Pseudo-Labels
Pengxiang Yan, Guanbin Li, Yuan Xie, Zhen Li, Chuan Wang, Tianshui Chen, Liang Lin
ICCV 2019 | [Project Page] | [Arxiv] | [CVF-Open-Access]

Usage

Requirements

This code is tested on Ubuntu 16.04, Python=3.6 (via Anaconda3), PyTorch=0.4.1, CUDA=9.0.

# Install PyTorch=0.4.1
$ conda install pytorch==0.4.1 torchvision==0.2.1 cuda90 -c pytorch

# Install other packages
$ pip install pyyaml==3.13 addict==2.2.0 tqdm==4.28.1 scipy==1.1.0
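
To quickly check that your environment matches the tested versions above, you can run a short Python snippet (a minimal sanity check, nothing repository-specific):

# Confirm PyTorch / torchvision / CUDA versions
import torch
import torchvision

print("torch:", torch.__version__)              # expected: 0.4.1
print("torchvision:", torchvision.__version__)  # expected: 0.2.1
print("CUDA available:", torch.cuda.is_available())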

Datasets

Our proposed RCRNet is evaluated on three public benchmark VSOD datasets: VOS, DAVIS (version: 2016, 480p), and FBMS. Please organize the datasets according to config/datasets.yaml and put them in data/datasets, or set the --data argument to the path of your dataset folder.
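
As a rough self-check (not part of the repository), the snippet below loads config/datasets.yaml and reports which of the paths it references exist under the data root; the exact schema of datasets.yaml is not reproduced here, so the sketch simply treats every string value in the config as a candidate path:

# Rough sketch for checking the dataset layout against config/datasets.yaml.
# Assumption: the YAML stores dataset locations as relative paths in its
# string values; adjust DATA_ROOT to whatever you pass via --data.
import os
import yaml

DATA_ROOT = "data/datasets"

with open("config/datasets.yaml") as f:
    cfg = yaml.safe_load(f)

def walk(node, prefix=""):
    # Recursively visit the config and test every string value as a path.
    if isinstance(node, dict):
        for key, value in node.items():
            walk(value, prefix + "/" + str(key))
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            walk(value, prefix + "/" + str(index))
    elif isinstance(node, str):
        candidate = os.path.join(DATA_ROOT, node)
        status = "ok     " if os.path.exists(candidate) else "MISSING"
        print(status, prefix, "->", candidate)

walk(cfg)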

Evaluation

Comparison with State-of-the-Art

[Figure: quantitative comparison with state-of-the-art video salient object detection methods]

If you want to compare with our method:

Option 1: You can download the saliency maps predicted by our model from Google Drive / Baidu Pan (passwd: u079).

Option 2: You can use our trained model for inference. The weights of the trained model are available at Google Drive / Baidu Pan (passwd: 6pi3). Then run the following commands for inference.

# VOS
$ CUDA_VISIBLE_DEVICES=0 python inference.py --data data/datasets --dataset VOS --split test

# DAVIS
$ CUDA_VISIBLE_DEVICES=0 python inference.py --data data/datasets --dataset DAVIS --split val

# FBMS
$ CUDA_VISIBLE_DEVICES=0 python inference.py --data data/datasets --dataset FBMS --split test

Then, you can evaluate the saliency maps using your own evaluation code.
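
If you do not already have an evaluation toolbox, a minimal sketch like the one below computes MAE and max F-measure (with the conventional beta^2 = 0.3). The directory names are placeholders, and it assumes predictions and ground-truth masks are grayscale images with matching filenames, so adapt it to your layout:

# Minimal VSOD evaluation sketch (not the authors' official evaluation code).
# Computes MAE and max F-measure over a folder of predictions vs. GT masks.
import os
import numpy as np
from PIL import Image  # Pillow is assumed to be installed

PRED_DIR = "results/VOS/test"                   # hypothetical prediction folder
GT_DIR = "data/datasets/VOS/test/Annotations"   # hypothetical GT folder

def load_gray(path, size=None):
    img = Image.open(path).convert("L")
    if size is not None:
        img = img.resize(size, Image.BILINEAR)  # size is (width, height)
    return np.asarray(img, dtype=np.float64) / 255.0

maes, precisions, recalls = [], [], []
thresholds = np.linspace(0.0, 1.0, 256)

for name in sorted(os.listdir(GT_DIR)):
    gt = load_gray(os.path.join(GT_DIR, name)) > 0.5
    pred = load_gray(os.path.join(PRED_DIR, name), size=gt.shape[::-1])
    maes.append(np.abs(pred - gt).mean())
    # Precision/recall at 256 thresholds (slow but simple).
    tp = np.array([(pred >= t)[gt].sum() for t in thresholds], dtype=np.float64)
    fp = np.array([(pred >= t)[~gt].sum() for t in thresholds], dtype=np.float64)
    precisions.append(tp / np.maximum(tp + fp, 1e-8))
    recalls.append(tp / max(float(gt.sum()), 1e-8))

precision = np.mean(precisions, axis=0)
recall = np.mean(recalls, axis=0)
beta2 = 0.3  # beta^2 = 0.3, the usual choice in salient object detection
f_measure = (1 + beta2) * precision * recall / np.maximum(beta2 * precision + recall, 1e-8)
print("MAE: %.4f  max F-measure: %.4f" % (np.mean(maes), f_measure.max()))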

Training

If you want to train our proposed model from scratch (including training with pseudo-labels), please refer to our paper and the training instructions carefully.

Citation

If you find this work helpful, please consider citing

@inproceedings{yan2019semi,
  title={Semi-Supervised Video Salient Object Detection Using Pseudo-Labels},
  author={Yan, Pengxiang and Li, Guanbin and Xie, Yuan and Li, Zhen and Wang, Chuan and Chen, Tianshui and Lin, Liang},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={7284--7293},
  year={2019}
}

Acknowledgements

Thanks to the third-party libraries used in this project.

rcrnet-pytorch's People

Contributors

kinpzz


rcrnet-pytorch's Issues

A question about the VOS dataset configuration

Hello, while reproducing your RCRNet results I barely changed your code. Using your trained pseudo-label generator to produce 1 pseudo-label every 5 frames, the performance on DAVIS and FBMS is roughly as reported, but on the VOS test set it is 5-6 points lower. I then dropped pseudo-labels entirely and set the pseudo-label generator's frame_between_label_num to 0, which is equivalent to directly generating 20% of the ground truth. Training with that, the VOS test metrics are still 5-6 points lower. However, running inference directly with the best_model you provide, the VOS metrics match yours. My current guess is that my VOS file configuration is wrong?
DAVIS configuration: the frame interval in JPEGImages is 1, and the labels (ground truth) in the pseudo-label folder have a frame interval of 5.
FBMS configuration: the frame interval in JPEGImages is irregular, matching the images with the original 100% ground truth (usually an interval of 20 frames), and the labels (ground truth) in the pseudo-label folder have that interval multiplied by 5.
VOS configuration: the frame interval in JPEGImages is 1, and the labels (ground truth) in the pseudo-label folder have a frame interval of 15 x 5 = 75.

I am not sure where exactly it went wrong. Could you help me check whether my VOS dataset configuration is the problem?

Is this the same paper as "Real-time Segmenting Human Portrait at Anywhere"?

Real-time Segmenting Human Portrait at Anywhere
Ruifeng Yuan, Yuhao Cheng, Yiqiang Yan, Haiyan Liu
Lenovo Research
Building 1, No. 10 Courtyard, Xibeiwang East Road, Beijing, China

I found that it also uses RCRNet.

But "Real-time Segmenting Human Portrait at Anywhere" has no training code on GitHub.

About the semi-supervised part of this work

Hello, after reading your paper I have a question: does training RCRNet on the pseudo-labels generated by the FGPLG together with the existing GT actually work better than using all of the GT? I noticed that the images for which pseudo-labels are generated actually do have GT.

Dataset splits for the pretrained image model

Hello, for the two datasets used to pretrain the image model, how are the training, cross-validation, and test sets split?

Training set

Hello,
From the paper, you use DAVIS (3455) + VOS (7650) + FBMS (720), i.e. 11825 GT frames in total. On top of that, for every 5 frames you use 1 GT and generate 1 pseudo-label (the 1/5 setting in the paper), so roughly 2365 GT + 2365 pseudo-labels are used to train the model, rather than exploiting the sparse annotations of VOS and FBMS and using the GT to generate labels for the unannotated frames. Is my understanding correct? (A small arithmetic sketch of this reading follows below.)
Thanks for your reply.
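
For reference, a tiny arithmetic sketch of that reading of the 1/5 setting (my interpretation of the question above, not the authors' code):

# 1/5 setting as described in the question above (illustration only).
labeled_frames = {"DAVIS": 3455, "VOS": 7650, "FBMS": 720}
total = sum(labeled_frames.values())   # 11825 annotated frames in total
gt_used = total // 5                   # ~2365 frames keep their ground truth
pseudo_labels = total // 5             # ~2365 additional frames get pseudo-labels
print(total, gt_used, pseudo_labels)   # 11825 2365 2365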

How is the pseudo-label generator's performance on the test set evaluated?

Hello, I have been studying your RCRNet recently and there are two things I do not quite understand. First, the input to your pseudo-label generator has 7 channels, which include the ground truth of neighboring frames; but when running on the test set, we cannot feed the test-set ground truth into the pseudo-label generator. So how is the pseudo-label generator's performance on the VOS test set in the paper obtained (Table 4)? Second, why do pseudo-labels help at all? Compared with the ground truth they still seem not good enough. Can I understand it this way: the guidance provided by pseudo-labels outweighs the errors they carry, so they still benefit training?

Imagesets txt

Hello, may I ask how I can get the MSRA-B_id.txt file?
