
cgnet's Introduction

Introduction

The demand for running semantic segmentation models on mobile devices has been increasing rapidly. Current state-of-the-art networks have an enormous number of parameters and are hence unsuitable for mobile devices, while other models with a small memory footprint follow the spirit of classification networks and ignore the inherent characteristics of semantic segmentation. To tackle this problem, we propose a novel Context Guided Network (CGNet), a light-weight and efficient network for semantic segmentation. We first propose the Context Guided (CG) block, which learns the joint feature of both the local feature and the surrounding context, and further improves the joint feature with the global context. Based on the CG block, we develop CGNet, which captures contextual information in all stages of the network and is specially tailored for increasing segmentation accuracy. CGNet is also elaborately designed to reduce the number of parameters and save memory footprint. Under an equivalent number of parameters, the proposed CGNet significantly outperforms existing segmentation networks. Extensive experiments on the Cityscapes and CamVid datasets verify the effectiveness of the proposed approach. Specifically, without any post-processing or multi-scale testing, the proposed CGNet achieves 64.8% mean IoU on Cityscapes with fewer than 0.5 M parameters.

Installation

  1. Install PyTorch
  • Env: PyTorch_0.4; cuda_9.2; cudnn_7.5; python_3.6
  2. Clone the repository
    git clone https://github.com/wutianyiRosun/CGNet.git
    cd CGNet
  3. Dataset
  • Download the Cityscapes dataset. It should have this basic structure.
├── cityscapes_test_list.txt
├── cityscapes_train_list.txt
├── cityscapes_trainval_list.txt
├── cityscapes_val_list.txt
├── cityscapes_val.txt
├── gtCoarse
│   ├── train
│   ├── train_extra
│   └── val
├── gtFine
│   ├── test
│   ├── train
│   └── val
├── leftImg8bit
│   ├── test
│   ├── train
│   └── val
├── license.txt
  • Download the Camvid dataset. It should have this basic structure.
├── camvid_test_list.txt
├── camvid_train_list.txt
├── camvid_trainval_list.txt
├── camvid_val_list.txt
├── test
├── testannot
├── train
├── trainannot
├── val
└── valannot
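
Before training, it can help to confirm that a dataset folder matches the layout listed above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` and the two tuples are hypothetical, and the entry names are simply copied from the listings above.

```python
from pathlib import Path

# Entry names copied from the directory listings above (not exhaustive).
CITYSCAPES_ENTRIES = (
    "cityscapes_train_list.txt",
    "cityscapes_val_list.txt",
    "cityscapes_trainval_list.txt",
    "cityscapes_test_list.txt",
    "gtFine/train", "gtFine/val", "gtFine/test",
    "leftImg8bit/train", "leftImg8bit/val", "leftImg8bit/test",
)
CAMVID_ENTRIES = (
    "camvid_train_list.txt",
    "camvid_val_list.txt",
    "camvid_trainval_list.txt",
    "camvid_test_list.txt",
    "train", "trainannot", "val", "valannot", "test", "testannot",
)


def missing_entries(root, expected):
    """Return the expected files or folders that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_entries(some_dataset_root, CAMVID_ENTRIES)` returns an empty list when every listed entry is present; `some_dataset_root` stands for wherever the dataset was unpacked.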

Train your own model

For Cityscapes

  1. Training on the train set
python cityscapes_train.py --gpus 0,1 --dataset cityscapes --train_type ontrain --train_data_list ./dataset/list/Cityscapes/cityscapes_train_list.txt --max_epochs 300
  2. Training on the train+val set
python cityscapes_train.py --gpus 0,1 --dataset cityscapes --train_type ontrainval --train_data_list ./dataset/list/Cityscapes/cityscapes_trainval_list.txt --max_epochs 350
  3. Evaluation (on the validation set)
python cityscapes_eval.py --gpus 0 --val_data_list ./dataset/list/Cityscapes/cityscapes_val_list.txt --resume ./checkpoint/cityscapes/CGNet_M3N21bs16gpu2_ontrain/model_cityscapes_train_on_trainset.pth
  4. Testing (on the test set)
python cityscapes_test.py --gpus 0 --test_data_list ./dataset/list/Cityscapes/cityscapes_test_list.txt --resume ./checkpoint/cityscapes/CGNet_M3N21bs16gpu2_ontrainval/model_cityscapes_train_on_trainvalset.pth
  5. Running time on Tesla V100 (single card, single batch)
56.8 ms with torch.cuda.synchronize()
20.0 ms without torch.cuda.synchronize()
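
CUDA kernels are launched asynchronously, so a timer that stops without synchronizing may measure mostly launch overhead, which is a likely reason for the gap between the two running times above. The sketch below shows one way to structure such a measurement. It is framework-agnostic and is not the repository's timing code: `mean_latency_ms`, `run_once`, and `sync` are hypothetical names, and with PyTorch on a GPU one would pass `torch.cuda.synchronize` as `sync`.

```python
import time


def mean_latency_ms(run_once, sync=None, warmup=3, iters=10):
    """Average wall-clock time of run_once() in milliseconds.

    sync is an optional zero-argument callable that blocks until all queued
    asynchronous work has finished (assumption: torch.cuda.synchronize on GPU).
    """
    for _ in range(warmup):
        run_once()          # warm-up runs are not timed
    if sync is not None:
        sync()              # drain pending work before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    if sync is not None:
        sync()              # wait for the timed work before stopping the clock
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters
```

Leaving `sync` as `None` corresponds to the measurement without synchronization.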

For Camvid

  1. Training on the train+val set
python camvid_train.py
  2. Testing (on the test set)
python camvid_test.py

Citation

If CGNet is useful for your research, please consider citing:

  @article{wu2020cgnet,
  title={Cgnet: A light-weight context guided network for semantic segmentation},
  author={Wu, Tianyi and Tang, Sheng and Zhang, Rui and Cao, Juan and Zhang, Yongdong},
  journal={IEEE Transactions on Image Processing},
  volume={30},
  pages={1169--1179},
  year={2020},
  publisher={IEEE}
}

License

This code is released under the MIT License. See LICENSE for additional details.

Thanks to the Third Party Libs

https://github.com/speedinghzl/Pytorch-Deeplab.


cgnet's Issues

GPU utilization problem

Hello. num_worker defaults to 1, and GPU utilization is then around 25%. With num_worker set to 8, it is around 60%. However, a new problem occurs:
=====> epoch[0/300] iter: (369/371) cur_lr: 0.000997 loss: 1.175 time:0.23
=====> epoch[0/300] iter: (370/371) cur_lr: 0.000997 loss: 0.740 time:0.23
cityscapes_train.py:42: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
input_var = Variable(input, volatile=True).cuda()
[0/500] time: 0.80
[1/500] time: 0.02
[2/500] time: 0.02
[3/500] time: 0.02
[4/500] time: 0.02
[5/500] time: 0.02
[6/500] time: 0.02
[7/500] time: 0.02
[8/500] time: 0.02
[9/500] time: 0.02
[10/500] time: 0.02
[11/500] time: 0.02
[12/500] time: 0.02
[13/500] time: 0.02
[14/500] time: 0.02
[15/500] time: 0.02
[16/500] time: 0.02
[17/500] time: 0.02
[18/500] time: 0.02
[19/500] time: 0.02
[20/500] time: 0.02
[21/500] time: 0.02
[22/500] time: 0.02
[23/500] time: 0.02
[24/500] time: 0.02
[25/500] time: 0.02
[26/500] time: 0.02
[27/500] time: 0.02
[28/500] time: 0.02
[29/500] time: 0.02
[30/500] time: 0.02
[31/500] time: 0.02
[32/500] time: 0.02
[33/500] time: 0.02
[34/500] time: 0.02
[35/500] time: 0.02
[36/500] time: 0.02
[37/500] time: 0.02
[38/500] time: 0.02
[39/500] time: 0.02
[40/500] time: 0.02
[41/500] time: 0.02
[42/500] time: 0.02
[43/500] time: 0.02
[44/500] time: 0.02
[45/500] time: 0.02
[46/500] time: 0.02
[47/500] time: 0.02
[48/500] time: 0.02
[49/500] time: 0.02
[50/500] time: 0.02
[51/500] time: 0.02
[52/500] time: 0.02
[53/500] time: 0.02
[54/500] time: 0.02
[55/500] time: 0.02
[56/500] time: 0.02
[57/500] time: 0.02
[58/500] time: 0.02
[59/500] time: 0.02
[60/500] time: 0.02
[61/500] time: 0.02
[62/500] time: 0.02
[63/500] time: 0.02
[64/500] time: 0.02
[65/500] time: 0.02
[66/500] time: 0.02
[67/500] time: 0.02
[68/500] time: 0.02
[69/500] time: 0.02
[70/500] time: 0.02
[71/500] time: 0.02
[72/500] time: 0.02
[73/500] time: 0.02
[74/500] time: 0.02
[75/500] time: 0.02

I do not know what this means. Could you please explain it?
Thank you.

No effective change in fps even after reducing input image size while training.

Hi

Nice work on CGNet; the results are fantastic. For an image size of 640x480 I am getting 69% mIoU on Cityscapes with 10 classes after training on a GTX 1080, at 43 fps.

I thought that if I reduced the input image size for training, I should get at least a 2x fps gain.

However, I was wrong. I got more or less the same performance, i.e. 50 fps.

Could you advise how to tune it for speed? I am OK with a small reduction in mIoU.

About cityscapes_inform.pkl

Thanks for sharing your code!
But I have a question: I can't find "cityscapes_inform.pkl" in the folder, and the folder "./dataset/wtfile" does not exist. How can I get this file?

Camvid

The Camvid dataset has 12 classes, labeled 0-11, doesn't it? If the number of labels is set lower than that, the confusion matrix does not seem to be correct.
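
The concern above can be illustrated with a small NumPy sketch of a confusion matrix and per-class IoU. This is not the repository's evaluation code; the function names and the `ignore_label` parameter are assumptions for illustration. Pixels whose ground-truth label falls outside `[0, num_classes)` are dropped here, so the class count directly changes which pixels are scored.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_label=None):
    """Rows are ground-truth classes, columns are predicted classes."""
    gt = np.asarray(gt).ravel().astype(np.int64)
    pred = np.asarray(pred).ravel().astype(np.int64)
    # Keep only pixels whose labels and predictions are valid class indices.
    keep = (gt >= 0) & (gt < num_classes) & (pred >= 0) & (pred < num_classes)
    if ignore_label is not None:
        keep &= gt != ignore_label
    index = gt[keep] * num_classes + pred[keep]
    counts = np.bincount(index, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def per_class_iou(cm):
    """IoU per class; classes that never occur get NaN."""
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(union > 0, tp / union, np.nan)
```

Evaluating the same predictions with `num_classes=11` and `num_classes=12` generally yields different IoU values whenever label 11 appears in the ground truth, so the class count should match how the annotations are encoded.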

Can I do inference using the CPU?

Getting an error while running cityscapes_eval.py on the Cityscapes dataset

When I run with "--gpu 0", it shows "RuntimeError: CUDA error: out of memory" because my GPU has only 2 GB of memory. When I run without "--cuda" or "--gpu", it still shows "RuntimeError: CUDA error: out of memory". How can I run test or eval on the CPU? Thanks!

CUDA runtime error in the Cityscapes training code

Getting an error while training on the Cityscapes dataset.
This is my training configuration:

python cityscapes_train.py --gpus "3,4" --data_dir ~/data/Cityscape_2017/ --dataset cityscapes --train_type ontrainval --train_data_list ~/data/Cityscape_2017/cityscapes_trainval_list.txt --max_epochs 350 --cuda True --scaleIn 1 --batch_size 4

The code ran and printed:
=====> use gpu id: '3,4'
====> Random Seed: 457
=====> current architeture: CGNet
=====> computing network parameters
the number of params: 0.50 M
the number of parameters: 496306
data['classWeights']: [ 1.4705521 9.505282 10.492059 10.492059 10.492059 10.492059
10.492059 10.492059 10.492059 10.492059 10.492059 10.492059
10.492059 10.492059 10.492059 10.492059 10.492059 10.492059
5.131664 ]
=====> Dataset statistics
mean and std: [72.3924 82.90902 73.158325] [45.319206 46.15292 44.91484 ]
torch.cuda.device_count()= 2
Got the GPU count
length of dataset is : 3475
length of dataset: 500
=====> no checkpoint found at './checkpoint/cityscapes/CGNet_M3N21bs16gpu2_ontrainval/model_1.pth'
=====> beginning training
=====> the number of iterations per epoch: 868
torch.Size([4, 3, 680, 680])
torch.Size([4, 680, 680])
/home/nithish/my_install/miniconda3/envs/CGNet/lib/python3.6/site-packages/torch/nn/functional.py:2351: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
torch.Size([4, 19, 680, 680])

/opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [10,0,0], thread: [223,0,0] Assertion t >= 0 && t < n_classes failed.
Traceback (most recent call last):
File "cityscapes_train.py", line 291, in <module>
train_model(args)
File "cityscapes_train.py", line 228, in train_model
lossTr, per_class_iu_tr, mIOU_tr, lr = train(args, trainLoader, model, criteria, optimizer, epoch)
File "cityscapes_train.py", line 100, in train
loss.backward()
File "/home/nithish/my_install/miniconda3/envs/CGNet/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/nithish/my_install/miniconda3/envs/CGNet/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
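
The assertion `t >= 0 && t < n_classes` in the log above fires when a target value lies outside `[0, n_classes)`. One plausible cause, not confirmed by the log, is label images that still contain raw IDs or an ignore value the loss is not told about. A minimal sketch for scanning a label array is shown below; `invalid_label_values` is a hypothetical helper, and `ignore_label=255` is an assumption that follows the common Cityscapes train-ID convention.

```python
import numpy as np


def invalid_label_values(label, num_classes, ignore_label=255):
    """Return the sorted label values that are neither valid classes nor ignored."""
    values = np.unique(np.asarray(label))
    return [
        int(v)
        for v in values
        if v != ignore_label and not (0 <= v < num_classes)
    ]
```

For the 19-class setup shown in the log, any value this returns for a training label (for example 19 or 33) would be enough to trigger the assertion.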

What's the difference between channel-wise convolution and depth-wise convolution?

Hi, I am confused by the channel-wise convolution operator. Could you give some suggestions on how to distinguish it from depth-wise convolution?
In your source code, I think it is more similar to the depth-wise convolution used in MobileNets.

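import torch.nn as nn

class ChannelWiseConv(nn.Module):
    def __init__(self, nIn, nOut, kSize, stride=1):
        """
        Args:
            nIn: number of input channels
            nOut: number of output channels, default (nIn == nOut)
            kSize: kernel size
            stride: optional stride rate for down-sampling
        """
        super().__init__()
        # "same"-style padding for odd kernel sizes
        padding = int((kSize - 1)/2)
        # groups=nIn: each filter is applied to a single input channel
        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), groups=nIn, bias=False)

    def forward(self, input):
        # forward pass added so the quoted snippet runs on its own
        return self.conv(input)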

And I found this paper, "ChannelNets: Compact and Efficient Convolutional Neural Networks via Channel-Wise Convolutions", which gives a definition of "channel-wise convolution": https://arxiv.org/abs/1809.01330

What kind of operator is actually used in CGNet?
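
For reference when comparing the two terms: in the quoted code the convolution is created with `groups=nIn`, which in PyTorch means each filter sees exactly one input channel, the configuration usually called depth-wise convolution. The sketch below only counts weights to show the effect of `groups`; `conv2d_weight_count` is a hypothetical helper that follows the `nn.Conv2d` weight shape `(out_channels, in_channels / groups, kH, kW)` and is not code from the repository.

```python
def conv2d_weight_count(n_in, n_out, k_size, groups=1):
    """Number of weights in a bias-free 2D convolution with square kernels."""
    if n_in % groups != 0 or n_out % groups != 0:
        raise ValueError("n_in and n_out must both be divisible by groups")
    # Each of the n_out filters spans n_in / groups input channels.
    return n_out * (n_in // groups) * k_size * k_size


# Standard 3x3 convolution versus the groups=nIn case from the quoted code.
standard = conv2d_weight_count(32, 32, 3)
grouped = conv2d_weight_count(32, 32, 3, groups=32)
```

With 32 input and output channels, the grouped variant needs a factor of 32 fewer weights than the standard one.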

*gtFine_labelTrainIds.png

I can't find the "*gtFine_labelTrainIds.png" files that are referenced in the list file. The dataset I downloaded from Cityscapes only contains "*gtFine_labelIds.png". Where can I get the "*gtFine_labelTrainIds.png" files? Thanks.
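
The official cityscapesScripts package includes a preparation script, createTrainIdLabelImgs.py, that generates the "*gtFine_labelTrainIds.png" files from the downloaded annotations. The sketch below only illustrates the remapping idea on an in-memory array. The ID table is reproduced from the cityscapesScripts label definitions as best recalled and should be checked against that package; `label_ids_to_train_ids` is a hypothetical helper name.

```python
import numpy as np

# labelId -> trainId for the 19 evaluated classes (assumed to match
# cityscapesScripts labels.py; every other ID is treated as ignored).
ID_TO_TRAIN_ID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}
IGNORE_LABEL = 255


def label_ids_to_train_ids(label_ids):
    """Map an array of Cityscapes labelIds to trainIds via a lookup table."""
    lut = np.full(256, IGNORE_LABEL, dtype=np.uint8)
    for label_id, train_id in ID_TO_TRAIN_ID.items():
        lut[label_id] = train_id
    return lut[np.asarray(label_ids, dtype=np.uint8)]
```

Applied to a loaded "*gtFine_labelIds.png" array, this yields values in 0-18 plus 255, which matches the 19 output channels visible in the training log elsewhere on this page.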

Speed Question (Wrong Speed Test)

Hi! Thanks for sharing your code.
I have seen the bi-seg result in your paper. I tried to reproduce the bi-seg result, but I only got 71% IoU (single scale). Have you successfully reproduced the 74% IoU result? Maybe they used multi-scale testing.
What are your advantages compared with bi-seg? Is it lighter (lower memory cost)?

License

What is the license of this repository?

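Results of retraining on the Cityscapes dataset

I retrained on Cityscapes several times using your code, but the result on the test set only reaches 54%. I also tried evaluating your pretrained model on the validation set, and as the paper states, it achieves 64% mIoU. While reading the code you provided, I found the following issues:
1. The seed you use is a random seed.
2. The data finally produced by your dataloader is of numpy type and is not converted to tensor format. (You declare a transform in the cityscape_train file, but do not use it.)
The code you provided is not your final experimental code. With the CGNet architecture fixed, what is the main reason I could not reproduce this result?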
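
On the first point, fixing the seed is a common way to make repeated runs comparable. The sketch below is an assumption about how one might do it, not the repository's code; `seed_everything` is a hypothetical helper, and the value 457 is taken from the training log shown in another issue on this page.

```python
import random

import numpy as np


def seed_everything(seed):
    """Seed Python's and NumPy's global random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    # With PyTorch one would typically also call torch.manual_seed(seed)
    # and torch.cuda.manual_seed_all(seed); omitted to keep this dependency-free.


seed_everything(457)
```

A fixed seed alone does not guarantee identical results across runs on a GPU, but it removes one source of variation.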

Training codes for other models?

Hi, I'd like to run performance tests on various GPUs to compare CGNet and BiSeNet. Could you please share the training and inference code for BiSeNet?

Untrack __pycache__

The __pycache__ directory should never be added to or tracked in git. It should be ignored using a .gitignore file. Please add such a file and untrack all added instances of __pycache__.

dataset

I noticed this in the code:
image = image[:, :, ::-1]
image -= self.mean
But cv2.imread returns images in BGR order, and self.mean is also in BGR order, so why is the conversion image[:, :, ::-1] needed?
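
The question comes down to whether the channel order of the image matches the channel order of the mean at the moment of subtraction. The NumPy sketch below makes that visible on a single pixel. It is only an illustration: `subtract_mean` is a hypothetical helper, and whether `self.mean` in the repository is stored in BGR or RGB order is not confirmed here.

```python
import numpy as np


def subtract_mean(image, mean, reverse_channels=False):
    """Optionally reverse the last axis (BGR <-> RGB), then subtract the mean."""
    image = np.asarray(image, dtype=np.float32)
    if reverse_channels:
        image = image[:, :, ::-1]  # same slicing as in the quoted code
    return image - np.asarray(mean, dtype=np.float32)


# One pixel in BGR order, and a per-channel mean given in the same order.
pixel_bgr = [[[10.0, 20.0, 30.0]]]
mean_bgr = [10.0, 20.0, 30.0]
aligned = subtract_mean(pixel_bgr, mean_bgr)
misaligned = subtract_mean(pixel_bgr, mean_bgr, reverse_channels=True)
```

When the mean is in the same order as the loaded image, reversing the channels first pairs blue with the red mean and vice versa; the reversal is only consistent if the mean is stored in the opposite (RGB) order.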
