
Camouflaged Object Detection (CVPR2020-Oral)

Authors: Deng-Ping Fan, Ge-Peng Ji, Guolei Sun, Ming-Ming Cheng, Jianbing Shen, Ling Shao.

0. Preface

  • Welcome to join the COD community! We have created a WeChat group chat; you can join it by adding the contact (WeChat ID: CVer222). Please include your affiliation.

  • This repository includes a detailed introduction, a strong baseline (Search & Identification Net, SINet), and one-key evaluation code for Camouflaged Object Detection (COD).

  • For more information about Camouflaged Object Detection, please visit our Project Page and read the Manuscript (PDF) / Chinese Version (PDF).

  • If you have any questions about our paper, feel free to contact Deng-Ping Fan or Ge-Peng Ji via e-mail. If you use SINet or the evaluation toolbox in your research, please cite this paper.

0.1. 🔥 NEWS 🔥

  • [2021/07/07] 💥 An enhanced version of SINet, SINet-V2, has been accepted at IEEE TPAMI 2022 (Paper | GitHub). SINet-V2 surpasses existing COD methods by a large margin while maintaining real-time inference.
  • [2020/10/22] 💥 Training code is available via email ([email protected]). Please provide your name & institution. Note that the code may only be used for research purposes.
  • [2020/11/21] Updated the evaluation tool: Bi_cam(cam>threshold)=1 -> Bi_cam(cam>=threshold)=1.
  • [2020/10/22] For Eq. (4): j = k+1, M -> j = m, k-1 (note that m is a specific layer; in our paper it is equal to 1).
  • [2020/09/09] SINet is the best-performing method on the open benchmark website (https://paperswithcode.com/task/camouflaged-object-segmentation).
  • [2020/08/27] Updated the description of Table 3 (baseline models are trained using training setting (iii) rather than (iv)).
  • [2020/08/05] The online demo has been released! (http://mc.nankai.edu.cn/cod).
  • [2020/06/11] We re-organized the training set (listed in the 2.2. Usage section); please download it again.
  • [2020/05/05] 💥 Released the testing code.
  • [2020/04/25] Training/testing code will be updated soon ...

0.2. Table of Contents

0.3. File Structure

SINet
├── EvaluationTool
│   ├── CalMAE.m
│   ├── Enhancedmeasure.m
│   ├── Fmeasure_calu.m
│   ├── main.m
│   ├── original_WFb.m
│   ├── S_object.m
│   ├── S_region.m
│   └── StructureMeasure.m
├── Images
│   ├── CamouflagedTask.png
│   ├── CamouflagingFromMultiView.png
│   ├── CmpResults.png
│   ├── COD10K-2.png
│   ├── COD10K-3.png
│   ├── COVID'19-Infection.png
│   ├── locust detection.png
│   ├── new_score_1.png
│   ├── PolypSegmentation.png
│   ├── QuantitativeResults-new.png
│   ├── SampleAquaticAnimals.png
│   ├── Search-and-Rescue.png
│   ├── SINet.png
│   ├── SubClassResults-1.png
│   ├── SubClassResults.png
│   ├── Surface defect Detection2.png
│   ├── TaskRelationship.png
│   ├── Telescope.png
│   └── UnderwaterEnhancment.png
├── MyTest.py
├── README.md
├── requirement.txt
└── Src
    ├── backbone
    ├── __init__.py
    ├── SearchAttention.py
    ├── SINet.py
    └── utils

1. Task Relationship


Figure 1: Task relationship. Given an input image (a), we present the ground-truth for (b) panoptic segmentation (which detects generic objects, including stuff and things), (c) salient object detection (which detects isolated objects that grasp human attention), and (d) the proposed concealed object detection task, where the goal is to detect objects that have a similar pattern to the natural habitat. In this example, the boundaries of the two butterflies are blended with the bananas, making them difficult to identify.


Figure 2: Given an input image (a), we present the ground-truth for (b) panoptic segmentation (which detects generic objects including stuff and things), (c) salient instance/object detection (which detects objects that grasp human attention), and (d) the proposed camouflaged object detection task, where the goal is to detect objects that have a similar pattern (e.g., edge, texture, or color) to the natural habitat. In this case, the boundaries of the two butterflies are blended with the bananas, making them difficult to identify. This task is far more challenging than the traditional salient object detection or generic object detection.

References of Salient Object Detection (SOD) benchmark works
[1] Video SOD: Shifting More Attention to Video Salient Object Detection. CVPR, 2019. (Project Page)
[2] RGB SOD: Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground. ECCV, 2018. (Project Page)
[3] RGB-D SOD: Rethinking RGB-D Salient Object Detection: Models, Datasets, and Large-Scale Benchmarks. TNNLS, 2020. (Project Page)
[4] Co-SOD: Taking a Deeper Look at the Co-salient Object Detection. CVPR, 2020. (Project Page)

2. Proposed Baseline

2.1. Overview


Figure 3: Overview of our SINet framework, which consists of two main components: the receptive field (RF) and the partial decoder component (PDC). The RF is introduced to mimic the structure of RFs in the human visual system. The PDC reproduces the search and identification stages of animal predation. SA = search attention function described in [71]. See § 4 for details.
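At a high level, the search and identification stages in Figure 3 can be paraphrased as follows (an illustrative sketch only, not the exact implementation; see Src/SINet.py for the real code):

```
# Rough dataflow of SINet (paraphrasing Fig. 3)
features   = backbone(image)                      # multi-level ResNet features
coarse_map = search_module(RF(features))          # SM: locate the candidate camouflaged region
enhanced   = SA(features, coarse_map)             # search attention [71] re-weights the features
final_map  = identification_module(RF(enhanced))  # IM: refine and segment the object
```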

2.2. Usage

The training and testing experiments are conducted using PyTorch on a single TITAN RTX GPU with 24 GB of memory.

Note that our model also runs on low-memory GPUs if you lower the batch size (~419 MB per image with Apex mode O1, and ~305 MB per image with Apex mode O2).

  1. Configuring your environment (Prerequisites):

    Note that SINet is only tested on Ubuntu OS with the following environments. It may work on other operating systems as well but we do not guarantee that it will.

    • Creating a virtual environment in terminal: conda create -n SINet python=3.6.

    • Installing necessary packages: pip install -r requirements.txt.

    • (Optional: only for training) Installing NVIDIA Apex to accelerate the training process with mixed precision (Instructions) (tested under CUDA 10.0 and cuDNN 7.4).

  2. Downloading Training and Testing Sets:
    • Download the NEW testing dataset (COD10K-test + CAMO-test + CHAMELEON) and move it into ./Dataset/TestDataset/; it can be found at this Google Drive link or Baidu Pan link (fetch code: z83z).

    • Download the NEW training dataset (COD10K-train), which can be found at this Google Drive link or Baidu Pan link (fetch code: djq2). Please refer to our original paper for the other training data.

  3. Testing Configuration:

    • After downloading the pre-trained model and testing data, just run MyTest.py to generate the final prediction maps: replace the trained-model directory (--model_path) and set the save directory for the inferred masks (--test_save).

    • Note that we re-trained our model (marked as $\diamondsuit$ in the following figure) with the mixed-precision training strategy of the Apex lib (mode=O1) and obtained better performance at 40 epochs. We provide the new pre-trained model here (Baidu Drive [fetch code: 2pp2] / Google Drive). Later, we will try SINet with different backbones to improve performance and provide a more comprehensive comparison.


  4. Evaluating your trained model:

    • One-key evaluation is written in MATLAB (revised from this link); please follow the instructions in main.m and just run it to generate the evaluation results in ./EvaluationTool/EvaluationResults/Result-CamObjDet/.
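For reference, MyTest.py min-max normalizes each inferred map before saving it as an image. A minimal standalone sketch of that post-processing step (a NumPy stand-in for the tensor code; the function name here is illustrative):

```python
import numpy as np

def normalize_map(cam):
    """Min-max normalize a prediction map to [0, 1] before saving.

    The 1e-8 epsilon keeps an all-constant map (max == min) from
    triggering a division by zero; such a flat map maps to all zeros.
    """
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```

This mirrors the normalization line used in the repository's test script, so a flat (all-negative-response) map is saved as an all-black mask rather than raising an error.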

3. Results

3.1. Qualitative Comparison


Figure 4: Qualitative results of our SINet and two top-performing baselines on COD10K. Refer to our paper for details.

3.2. Quantitative Comparison (Overall/Sub-class)


Table 1: Quantitative results on different datasets. The best scores are highlighted in bold.


Table 2: Quantitative results of Structure-measure (Sα) for each sub-class in our COD10K dataset-(1/2). The best score of each category is highlighted in bold.


Table 3: Quantitative results of Structure-measure (Sα) for each sub-class in our COD10K dataset-(2/2). The best score of each category is highlighted in bold.

3.3. Results Download

  1. Results of our SINet can be found in this download link.

  2. Performance of competing methods can be found in this download link.

4. Proposed COD10K Datasets


Figure 5: The extraction of individual samples, covering 20 sub-classes, from our COD10K (2/5): aquatic animals.


Figure 6: Annotation diversity and meticulousness in the proposed COD10K dataset. Instead of only providing coarse-grained object-level annotations with the three major types of bias found in prior works (e.g., embedded watermarks, coarse annotation, and occlusion), we offer six different annotations, including edge-level (4th row), object-level (5th row), instance-level (6th row), bounding boxes (7th row), and attributes (8th row). Refer to the manuscript for more attribute details.


Figure 7: Regularized quality control during our labeling re-verification stage. We strictly adhere to four major criteria for rejection or acceptance to approach the ceiling of annotation accuracy.

COD10K dataset: Baidu (fetch code: aq4i) | Google

5. Evaluation Toolbox

We provide a complete and fair one-key evaluation toolbox for benchmarking under a uniform standard. Please refer to the following links for more information. MATLAB version: https://github.com/DengPingFan/CODToolbox | Python version: https://github.com/lartpang/PySODMetrics
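The toolbox reports MAE, S-measure, E-measure, and (weighted) F-measure. As a minimal illustration of two of these metrics (a sketch only, not a replacement for the toolbox; function names are illustrative), note the >= binarization matching the 2020/11/21 toolbox fix:

```python
import numpy as np

def mae(pred, gt):
    # Mean Absolute Error between a [0, 1] prediction map and a binary GT map.
    return float(np.abs(pred - gt).mean())

def f_measure(pred, gt, threshold=0.5, beta2=0.3):
    # Binarize with >=, matching the 2020/11/21 toolbox fix:
    # Bi_cam(cam >= threshold) = 1.
    bi = pred >= threshold
    gt = gt.astype(bool)
    tp = np.logical_and(bi, gt).sum()
    precision = tp / max(bi.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0
```

Here beta2 = 0.3 is the value conventionally used in salient/camouflaged object detection benchmarks.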

6. Potential Applications

  1. Medical (polyp segmentation and COVID-19 infection segmentation/diagnosis). Please refer to this page (https://github.com/DengPingFan/Inf-Net) for more details.


Figure 8: Lung Infection Segmentation.


Figure 9: Example of COVID-19 infected regions in a CT axial slice, where the red and green regions denote ground-glass opacity (GGO) and consolidation, respectively. The images are collected from the COVID-19 CT segmentation dataset (https://medicalsegmentation.com/covid19/, accessed 2020-04-11).

  2. Agriculture (locust detection to prevent invasions)


Figure 10: Locust disaster detection.

  3. Art (e.g., photorealistic blending or recreational art)


Figure 11: The answer can be found here (Camouflaging an Object from Many Viewpoints, CVPR 2014).

  4. Computer Vision (e.g., search-and-rescue work or rare species discovery)


Figure 13: Search and Rescue for saving lives.

  5. Underwater Image Enhancement


Figure 14: Please refer to "An Underwater Image Enhancement Benchmark Dataset and Beyond, TIP2019" for more details.

  6. Surface Defect Detection


Figure 15: Please refer to "A review of recent advances in surface defect detection using texture analysis techniques, 2008" for more details.

7. User Study Test

--> Click here to explore more interesting things (YouTube Link) <--

8. Citation

Please cite our paper if you find the work useful:

@inproceedings{fan2020Camouflage,
title={Camouflaged Object Detection},
author={Fan, Deng-Ping and Ji, Ge-Peng and Sun, Guolei and Cheng, Ming-Ming and Shen, Jianbing and Shao, Ling},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2020}
}

9. LICENSE

  • The COD10K Dataset is made available for non-commercial purposes only.

  • You will not, directly or indirectly, reproduce, use, or convey the COD10K Dataset or any Content, or any work product or data derived therefrom, for commercial purposes.

This code is for academic communication only and not for commercial purposes. If you want to use it for commercial purposes, please contact us.

Redistribution and use in source form, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

10. Acknowledgements

We would like to thank the authors of the CHAMELEON and CAMO datasets; they put tremendous effort into these datasets to advance this field. We also appreciate the image annotators and Wenguan Wang, Geng Chen, and Hongsong Wang for insightful feedback and discussions.

11. TODO LIST

If you want to improve usability or have any advice, please feel free to contact us directly (e-mail).

  • Support NVIDIA APEX training.

  • Support different backbones (e.g., VGGNet, ResNet, ResNeXt, Res2Net, iResNet, and ResNeSt).

  • Support distributed training.

  • Support lightweight architectures and real-time inference, like MobileNet and SqueezeNet.

  • Add more comprehensive competitors.

12. FAQ

  1. If the images cannot be loaded on the page (mostly under domestic network conditions):

    Solution Link


⬆ back to top


sinet's Issues

Dataset statistics figures

Could you explain how the dataset-statistics figures in the paper were drawn, i.e., the normalized object size plot, the center-bias plot, and the global/local contrast distribution plots?

PyTorch version

Hello! Which PyTorch version is used for training?

train.py file

Hello, what you provide is only for testing on the datasets directly. I would like to train the model myself; do you have something like a train.py file?

Multi-class adaptation

Hello, and thank you for open-sourcing the code. I have a few questions:
The labels in the dataset used by the code are 0 (background) and 255 (foreground), while the labels in my own dataset are 0 (background) and 1, 2, 3, ... (multiple classes). I modified NCD and GRA, added a class-count parameter, and changed the loss function to cross-entropy (CE), but the final results are unsatisfactory. Is this a valid way to adapt the model, and what problems might such a modification cause?

Can I test my own dataset

I would like to test on my own dataset, but I do not have the ground truth (GT).

Why is the GT needed? Is it only for validation?

MyTrain

Where is "MyTrain.py"?

Some questions about the paper

Dear authors,
After reading your work, I have a few points I would like you to clarify:
1. The Chinese/English papers provided on GitHub differ slightly from the paper on CVF Open Access; for example, Sec. 5.1 removes training set (i) CPD1K and Table 3 removes the CPD1K-Test results, and there may be other differences I did not notice. Is the GitHub version the final one (should I treat its numbers as authoritative)?
2. The paper states that 10,000 images were collected, but after downloading, training + testing = 4,040 + 2,026 < 10,000. Given that the paper says the COD10K test set contains 2,026 images, is the provided training set missing data? For COD10K alone, was the model trained on 4,040 images or on 7,974 (= 10,000 - 2,026)? (My apologies if I simply missed something.)
3. In the latest version of the paper, training setting (iii) uses CAMO + COD10K + EXTRA as the training set. What does EXTRA refer to (CPD1K?)? For COD10K, the same 4,040-vs-7,974 question applies.
4. After ImageNet pre-training (and before training on the COD training sets), was the model fine-tuned on SOD training sets?
5. You updated the tables and report better results after 40 epochs. If future work compares against your model, should it cite the numbers in the original paper or the updated ones?

SINet.py question

Hello Dr. Fan, could you help me?
In the paper, there is a Bconv1x1 after concatenation C in the RF module, but in SINet.py the code is self.conv_cat = BasicConv2d(4*out_channel, out_channel, 3, padding=1), i.e., the kernel size is 3. Is there a problem here, or did I misunderstand something?

Loss

weit = 1+5*torch.abs(F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15)-mask)
Hello, your paper uses the loss function from F3Net. Is kernel_size a hyperparameter, and how was 31 chosen? Could values such as 51, 15, or 9 be used instead? Thanks for your answer!
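For context, the quoted line builds a boundary-aware weight map: pixels whose local average differs most from their own value, i.e., pixels near the mask boundary, receive up to 6x weight. A NumPy re-implementation of just that weight map (a sketch assuming PyTorch's zero-padded avg_pool2d semantics; the names avg_pool2d and boundary_weight are illustrative):

```python
import numpy as np

def avg_pool2d(x, k):
    """Mean filter with stride 1 and zero padding k//2 (PyTorch-style)."""
    p = k // 2
    xp = np.pad(x, p)                                       # zero-pad each side
    ii = np.pad(xp.cumsum(0).cumsum(1), ((1, 0), (1, 0)))   # integral image
    h, w = x.shape
    win = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
           - ii[k:k + h, :w] + ii[:h, :w])                  # window sums
    return win / (k * k)

def boundary_weight(mask, k=31, scale=5.0):
    # weit = 1 + 5 * |avg_pool2d(mask, 31, stride=1, padding=15) - mask|
    return 1.0 + scale * np.abs(avg_pool2d(mask, k) - mask)
```

The kernel size controls how wide the up-weighted band around the boundary is, so it acts as a hyperparameter of the loss.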

Single-channel training

Hello, my training images are single-channel, while yours are three-channel. I changed the first layer of the pre-trained ResNet to accept single-channel input, but the code does not run correctly. How should I modify it to support single-channel segmentation? I have also emailed you to ask for train.py as a reference.

Error in pip install requirement.txt

I am trying to test SINet on a Windows machine. When I executed the command pip install requirement.txt, it gave me this error:
ERROR: Could not find a version that satisfies the requirement torch==1.3.1 (from versions: 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2)
ERROR: No matching distribution found for torch==1.3.1

Please help me resolve this issue on a Windows machine.

train.py

Hello, I have some remote-sensing data that I would like to train on. Could you provide a train.py file? Thanks.

Using SINet for camouflaged target detection

I would like to use this network for detecting camouflaged targets. Can it replace the backbone of an existing generic object-detection framework? Have you compared its results with those of generic detection frameworks?
Alternatively, is there another way to perform object detection with SINet?

Missing requirements.txt when setting up the environment

Dear authors:
After reading your paper, I wanted to set up the environment and reproduce your test results, but at the pip install -r requirements.txt step I could not find a requirements.txt file in the repository. Could you check and provide it? Thank you for your work.

Make my datasets for the code

I am interested in the paper and code.
I want to train the code on my own datasets, but I don't know how to create the annotations. Can you recommend some tools or give other advice? Thanks!

training setting (iv)

Hi,

In Table 3, there are three training settings for SINet, and you mention that the baseline models are trained using training setting (iv), but I cannot find any description of this setting in the paper. Am I missing something?

Downloading pre-trained model

I was trying to run and test this repository using a pre-trained SINet model, but I cannot figure out how to download the pre-trained model mentioned in step 3 (Testing Configuration) of the README.

Download the pre-trained model?

I tried to download the pre-trained model from Baidu Drive, but it requires a fetch code, which I could not find in the README.

About normalization

cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

Why normalize this way? If the input prediction map is a negative sample with no positive responses, isn't this normalization unreasonable?

instance-level segmentation

Hello, I noticed that COD10K already provides instance-level annotations. Does SINet support instance-level segmentation now? If not, I would like to try extending it to the instance level; please share your suggestions. Thank you.
