
dsrg's Introduction

Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing (CVPR2018)

By Zilong Huang, Xinggang Wang, Jiasi Wang, Wenyu Liu and Jingdong Wang.

This code is an implementation of the weakly-supervised semantic segmentation experiments in the DSRG paper. It is developed on the Caffe framework.

Introduction

[Figure: Overview of DSRG] Overview of the proposed approach. The Deep Seeded Region Growing module takes the seed cues and the current segmentation map as input and produces latent pixel-wise supervision that is more accurate and more complete than the seed cues alone. Our method iterates between refining the pixel-wise supervision and optimizing the parameters of the segmentation network.
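
To make the alternation concrete, here is a toy, self-contained numpy sketch of the loop. Everything in it (the linear "network", the confidence threshold, the sizes) is an illustrative stand-in, not the released Caffe pipeline, and the real DSRG layer additionally enforces spatial connectivity to the seeds:

    import numpy as np

    # Toy stand-ins so the loop runs end to end; the real code uses a Caffe
    # DeepLab network and refreshes supervision via the DSRG layer and a CRF.
    rng = np.random.default_rng(0)
    n_pix, n_cls = 41 * 41, 21
    features = rng.normal(size=(n_pix, 8))
    theta = rng.normal(size=(8, n_cls)) * 0.01
    seeds = np.full(n_pix, -1)                      # -1 = unlabelled pixel
    idx = rng.choice(n_pix, 200, replace=False)
    seeds[idx] = rng.integers(0, n_cls, 200)

    def softmax(z):
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    for step in range(100):
        probs = softmax(features @ theta)           # current segmentation map
        # "region growing": accept confident predictions as extra supervision
        grown = seeds.copy()
        grow = (grown < 0) & (probs.max(axis=1) > 0.9)
        grown[grow] = probs.argmax(axis=1)[grow]
        # seeding loss on the grown supervision; gradient step on theta
        sup = grown >= 0
        grad = probs.copy()
        grad[sup, grown[sup]] -= 1.0
        grad[~sup] = 0.0
        theta -= 0.1 * features.T @ grad / max(sup.sum(), 1)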

License

DSRG is released under the MIT License (refer to the LICENSE file for details).

Citing DSRG

If you find DSRG useful in your research, please consider citing:

@inproceedings{huang2018dsrg,
    title={Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing},
    author={Huang, Zilong and Wang, Xinggang and Wang, Jiasi and Liu, Wenyu and Wang, Jingdong},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    pages={7014--7023},
    year={2018}
}

Installing dependencies

  • Python packages:
      $ pip install -r python-dependencies.txt
  • caffe (deeplabv2 version): deeplabv2 caffe installation instructions are available at https://bitbucket.org/aquariusjay/deeplab-public-ver2. Note that you need to compile caffe with the Python wrapper and support for Python layers, then add the caffe Python path to training/tools/findcaffe.py (a sketch of that file follows this list).

  • Fully connected CRF wrapper (requires the Eigen3 package).

      $ pip install CRF/
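
For reference, here is a plausible findcaffe.py along the lines the instructions describe; the path below is a placeholder you must adjust to your own deeplabv2 caffe build:

    import sys

    # Point this at the python/ folder of your compiled deeplabv2 caffe.
    caffe_path = '/path/to/deeplab-public-ver2/python'
    if caffe_path not in sys.path:
        sys.path.insert(0, caffe_path)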

Training the DSRG model

  • Go into the training directory:
      $ cd training
      $ mkdir localization_cues
  • Download the initial VGG16 model pretrained on ImageNet and put it in the training/ folder.

  • Download the CAM seeds and put them in the training/localization_cues folder. We use CAM to localize the foreground seed classes and the saliency detection method DRFI to localize the background seeds. We provide a Python interface to DRFI here for convenience if you want to generate the seeds yourself. (A sketch for inspecting the downloaded cues file follows these steps.)

      $ cd training/experiment/seed_mc
      $ mkdir models
  • Set the root_folder parameter in train-s.prototxt and train-f.prototxt, and PASCAL_DIR in run-s.sh, to the directory with the PASCAL VOC 2012 images.

  • Run:

      $ bash run.sh

The trained model will be created in the models folder.
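
If you want to sanity-check the downloaded cues before training, the pickle appears to follow the convention of the SEC code this project builds on ('<id>_labels' / '<id>_cues' keys); treat the exact file name and key layout as assumptions and verify them against pylayers.py:

    import pickle

    # Inspect the downloaded cue file (file name and key layout are assumptions).
    with open('localization_cues/localization_cues-sal.pickle', 'rb') as f:
        cues = pickle.load(f)        # add encoding='latin1' on Python 3

    key = next(k for k in cues if k.endswith('_cues'))
    print(key, cues[key].shape)      # rows: class index, row, column of each seed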

Acknowledgment

This code borrows heavily from SEC.


dsrg's Issues

Comprehension of pretrained VGG16 model

Great work! When using the pretrained VGG16 weights, I noticed that deploy.prototxt contains the ASPP module, which the original VGG16 does not include. So I would like to know how exactly the pretrained weights were obtained. Thanks!

Understanding Seeding Loss

Trying to understand the seeding loss. In SeedLossLayer, are probs and labels the predicted and ground-truth segmentation masks?
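
For anyone else reading this thread, here is a minimal numpy sketch of the seeding loss as the paper defines it, assuming probs is the network's softmax output and cues is the binary seed mask (the paper normalises foreground and background seeds separately; this collapses them for brevity):

    import numpy as np

    def seeding_loss(probs, cues, eps=1e-8):
        # probs: (C, H, W) softmax output of the network
        # cues : (C, H, W) binary mask, 1 where class c is seeded at (h, w)
        # Cross-entropy evaluated only at seed locations, averaged over seeds.
        n_seeds = cues.sum()
        return -(cues * np.log(probs + eps)).sum() / max(n_seeds, 1)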

Failed to load caffe layers

Hi,

I have got this error:
F0730 17:42:12.306586 5815 layer_factory.hpp:81] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: ImageData (known types: AbsVal, Accuracy, ArgMax, BNLL, BatchNorm, BatchReindex, Bias, Clip, Concat, ContrastiveLoss, Convolution, Crop, Data, Deconvolution, Dropout, DummyData, ELU, Eltwise, Embed, EuclideanLoss, Exp, Filter, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, InfogainLoss, InnerProduct, Input, LRN, LSTM, LSTMUnit, Log, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Parameter, Pooling, Power, Python, RNN, ReLU, Reduction, Reshape, SPP, Scale, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, Swish, TanH, Threshold, Tile)

I have built caffe (make all and make pycaffe) successfully but still get this error. Any ideas?

BTW, I have uncommented WITH_PYTHON_LAYER := 1 in the caffe make file.

MANY THANKS!!!

Localization cues

Hi, I could not install your DRFI module successfully, so I used another implementation (https://github.com/playerkk/drfi_matlab) and set values < 0.06 as background seeds, as described in your paper. However, I cannot reproduce the results reported in your paper: the baseline (with improved localization cues only) is 52.5 mIoU in the paper, but I only achieved 50.2 mIoU. By the way, did your baseline include the expand loss? What are the training hyperparameters in Table 2? And could you please upload the localization cues with background and foreground? Thanks a lot.

Loss calculation is too slow

Hi,

First of all, thank you very much for providing the implementation of your paper.
I am trying to recreate the results in PyTorch, but training is very slow due to the consistency loss computation, which uses pydensecrf and takes a long time even for moderately sized images (~356 px).
Do you have any recommendations on how to speed things up? And can you please share how long the network took to train in your case?

Thank you very much.
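
Not the authors' answer, but the two knobs that usually dominate pydensecrf runtime are the number of mean-field iterations and the image resolution. A sketch of running the CRF with fewer iterations (downsampling the image and probs before the call is another common trick):

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_softmax

    def fast_crf(image, probs, iters=3):
        # image: (H, W, 3) uint8, probs: (C, H, W) float32 softmax output.
        c, h, w = probs.shape
        d = dcrf.DenseCRF2D(w, h, c)
        d.setUnaryEnergy(unary_from_softmax(probs))
        d.addPairwiseGaussian(sxy=3, compat=3)
        d.addPairwiseBilateral(sxy=50, srgb=5,
                               rgbim=np.ascontiguousarray(image), compat=10)
        q = np.array(d.inference(iters))   # fewer iterations than the usual 10
        return q.reshape(c, h, w)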

Code about Seed

Hi @speedinghzl, can you release the code for generating the seeds? Thanks!

Question about seed dynamic updating

It's a very interesting work. We are conducting a similar segmentation task without an initial cues file, and I hope to generate seeds in the first epoch and update them in subsequent epochs.
However, in pylayers.py the seeds grown by the SRG method appear to be computed from the cues loaded from the external pickle file and used directly for the seed loss, which means the cues pickle has to be imported at every epoch. This seems to differ from the paper, which says the current seed map is obtained from the seed map generated in the previous epoch.
I can't tell when and where the previous seeds (as opposed to the loaded pickle file) are replaced with the updated seeds and used as the newest seeds. Can you help me?
Thank you.
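
In case it helps to pin down the mechanism being discussed, here is a from-scratch sketch of the growing criterion as the paper describes it (not the repository's pylayers.py, which also uses separate thresholds for foreground and background):

    import numpy as np
    from collections import deque

    def grow(seed, probs, theta=0.85):
        # seed : (H, W) int, class index, or -1 for an unlabelled pixel
        # probs: (C, H, W) softmax output of the current network
        # A 4-connected neighbour is absorbed into class c when its
        # predicted probability for c exceeds the threshold theta.
        h, w = seed.shape
        out = seed.copy()
        frontier = deque(zip(*np.nonzero(seed >= 0)))
        while frontier:
            y, x = frontier.popleft()
            c = out[y, x]
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < h and 0 <= nx < w
                        and out[ny, nx] < 0 and probs[c, ny, nx] > theta):
                    out[ny, nx] = c
                    frontier.append((ny, nx))
        return out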

Failed to install CRF

C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:c:\users\appdata\local\continuum\anaconda3\libs /LIBPATH:c:\users\appdata\local\continuum\anaconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\LIB\amd64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\LIB\amd64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.14393.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.14393.0\um\x64" /EXPORT:PyInit_krahenbuhl2013/wrapper build\temp.win-amd64-3.5\Release\krahenbuhl2013/wrapper.obj build\temp.win-amd64-3.5\Release\src/densecrf.obj build\temp.win-amd64-3.5\Release\src/labelcompatibility.obj build\temp.win-amd64-3.5\Release\src/pairwise.obj build\temp.win-amd64-3.5\Release\src/permutohedral.obj build\temp.win-amd64-3.5\Release\src/unary.obj build\temp.win-amd64-3.5\Release\src/util.obj build\temp.win-amd64-3.5\Release\src/densecrf_wrapper.obj /OUT:build\lib.win-amd64-3.5\krahenbuhl2013/wrapper.cp35-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.5\Release\krahenbuhl2013\wrapper.cp35-win_amd64.lib
LINK : error LNK2001: unresolved external symbol PyInit_krahenbuhl2013/wrapper
build\temp.win-amd64-3.5\Release\krahenbuhl2013\wrapper.cp35-win_amd64.lib : fatal error LNK1120: 1 unresolved externals
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\link.exe' failed with exit status 1120

----------------------------------------

Command "c:\users\appdata\local\continuum\anaconda3\python.exe -u -c "import setuptools, tokenize;file='C:\Users\AppData\Local\Temp\pip-req-build-1u64fmyp\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\AppData\Local\Temp\pip-record-blyy3smc\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\AppData\Local\Temp\pip-req-build-1u64fmyp\

About Seed with person

Hi @speedinghzl, your work is really great. But I have a question: I only want to segment people and background, and I don't want to segment any other objects. Is it possible to use CAM or DRFI to localize only person seeds, treating people as foreground and everything else as background?

Training with custom data

I want to use this code to train on my own data. How do I go about doing that (organizing data, labels, etc.)?
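
Not an official answer, but if the training code consumes SEC-style cue pickles (an assumption; verify the key names and class indexing against pylayers.py), packing your own seeds would look roughly like this:

    import pickle
    import numpy as np

    # Hypothetical example: one image, class 15 (person), seeds on a 41x41 grid.
    seed_mask = np.full((41, 41), -1)
    seed_mask[18:23, 18:23] = 15

    image_id = 0                                  # index into your image list file
    ys, xs = np.nonzero(seed_mask == 15)
    cues = {
        '%d_labels' % image_id: np.array([15]),   # image-level class labels
        '%d_cues' % image_id: np.stack([np.full_like(ys, 15), ys, xs]),
    }

    with open('localization_cues/my_cues.pickle', 'wb') as f:
        pickle.dump(cues, f)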

A juvenile question

Thanks for your work!!!

When I run the command:
python ../../tools/test-ms-f.py --model models/model-f_iter_20000.caffemodel --images list/val_id.txt --dir /xxx/xxx/xxx --output DSRG_final_output --gpu 0 --smooth true
I get this error:
Check failed: error == cudaSuccess (2 vs. 0) out of memory
It turns out the error happens on the line net.forward().
I am using an NVIDIA GTX 960 with 4 GB.
This problem didn't occur during training.
How can I find the batch_size so I can reduce memory usage?
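
Not a maintainer answer, but with pycaffe you can usually bound test-time memory by reshaping the input blob to a smaller size before forwarding. The blob name 'images' and the prototxt file name below are guesses, so check the deploy prototxt that test-ms-f.py actually loads:

    import caffe

    caffe.set_mode_gpu()
    net = caffe.Net('deploy.prototxt',            # guess: the test-time prototxt
                    'models/model-f_iter_20000.caffemodel', caffe.TEST)
    net.blobs['images'].reshape(1, 3, 321, 321)   # batch size 1, smaller scale
    net.reshape()
    out = net.forward()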

Why are there two different solver.prototxt files?

In the experiment folder there are two different solver prototxts, solver-f and solver-s; may I ask how to use them?
solver-s seems to use the 41x41 seeds for training, while solver-f seems to use the DSRG output for training.
Does solver-s represent the main procedure for training DSRG, and solver-f the 'Retrain' procedure referred to in your paper?
Thanks!

comprehension of DSRG

Hi @speedinghzl, I'd like to ask: during network training, does each iteration use the seed regions expanded after the previous iteration as the initial seed regions, continually updating them, or do the initial seed regions stay fixed, with the DSRG layer growing from the initial seeds at every training iteration and using the grown result as labels to compute the seed loss?

Reproducing Paper Results

Hi,

I am trying to reproduce the 59.0 mIOU seen in the paper, but so far, all I can achieve is about 54.3 mIOU.

I wrote a script to convert all of the VOC SegmentationClass ground truth from its original RGB form to a single-channel label format where the pixel intensity corresponds to the alphabetically ordered VOC classes (e.g., 0 is background, 1 is aeroplane, 15 is person, etc.).

After doing so and training, I am getting an mIOU of .543.

I then converted the SBD ground truths to the same 0-21 label PNG format and placed those images in my SegmentationClass folder. After re-running run.sh, I got .542 mIOU, very little difference. Perhaps this is the wrong way of including the SBD annotations, but I'm not sure how else I would include them. I suppose that to train, I should only need image-level labels from SBD, not the full segmentations, and I may not even need those, since they are likely included in localization_cues-sal.pickle.

Do I need to edit any list files or maybe place the SBD files in a different directory? Is there any other data augmentation you used on the VOC and/or SBD data?

If you're interested in how I did the format conversions, you can see the scripts here: https://github.com/mcever/Point-DSRG/tree/master/training/tools/data_prep

Any help you can provide would be greatly appreciated. I'm having a hard time figuring out why the mIOU didn't change much after augmenting with SBD, and I'm not sure why there's still a .05 gap between my results and your report. My best guess is that there is some data augmentation I should apply to the JPEGImages, but it could also have to do with the SBD data if image-level labels are fetched from outside the pickle file during training.

Thanks,
Austin
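
A side note on the conversion step described above: the VOC SegmentationClass PNGs are palette ("P" mode) images, so PIL already yields class indices directly, with 255 marking the void/boundary label; no RGB lookup table should be needed:

    import numpy as np
    from PIL import Image

    # Palette PNG -> class indices, no RGB decoding required.
    gt = np.array(Image.open('SegmentationClass/2007_000032.png'))
    print(np.unique(gt))             # e.g. [  0   1  15 255]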

questions about seed-loss

Thank you very much, your work is really great!
During training I found that after 2000 iterations the seed loss starts to oscillate while the constrain loss stays almost unchanged. I also compared the outputs and found that the output of the dsrg layer is clearly better than that of the sec-8 layer. Do you think the seed loss fails to converge because of the segmentation network's performance?

Training Issue likely from Version Conflicts

Hi,

I am trying to use the DSRG code, and I am beginning by trying to train a model myself using the run script; however, I get the following error each time the code tries to import caffe:

$ bash run.sh
/media/ssd1/austin/DSRG/deeplab-public-ver2/python
Traceback (most recent call last):
File "../../tools/train.py", line 12, in
import caffe
File "/media/ssd1/austin/DSRG/deeplab-public-ver2/python/caffe/init.py", line 1, in
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver
, AdamSolver
File "/media/ssd1/austin/DSRG/deeplab-public-ver2/python/caffe/pycaffe.py", line 15, in
import caffe.io
File "/media/ssd1/austin/DSRG/deeplab-public-ver2/python/caffe/io.py", line 8, in
from caffe.proto import caffe_pb2
File "/media/ssd1/austin/DSRG/deeplab-public-ver2/python/caffe/proto/caffe_pb2.py", line 905, in
options=_descriptor._ParseOptions(descriptor_pb2.FieldOptions(), _b('\020\001')), file=DESCRIPTO$),
TypeError: new() got an unexpected keyword argument 'file'

I found a similar issue here (BVLC/caffe#6143), which suggests that I have the wrong version of something, but I'm trying to work out which dependency is causing the problem. I don't have much experience with caffe, but I expect it's an issue with Ubuntu packages (like libprotobuf or protoc) as I'm using a virtual environment for all my Python packages.

Any help is much appreciated.

About Loss Function

Hi!
Thanks for your work!
In your paper, the seed loss and boundary loss are combined into a single loss:

[screenshot of the combined loss equation]

However, I haven't found this combination in either your code or the prototxt files:

[screenshot of the prototxt]

Is there anything I have missed or misunderstood?
