
pytorch-deform-conv's Introduction

PyTorch implementation of Deformable Convolution

!!!Warning: There are some issues in this implementation and this repo is no longer maintained. Please consider using TORCHVISION.OPS.DEFORM_CONV instead.

TODO List

  • implement offsets mapping in pytorch
  • all tests passed
  • deformable convolution module
  • Fine-tuning the deformable convolution modules
  • scaled mnist demo
  • improve speed with cached grid array
  • use MNIST dataset from pytorch (instead of Keras)
  • support input image with different width and height
  • benchmark with tensorflow implementation

Deformable Convolutional Networks

Dai, Jifeng, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen Wei. 2017. “Deformable Convolutional Networks.” arXiv [cs.CV]. arXiv. http://arxiv.org/abs/1703.06211

The following animation is generated by Felix Lau (with his tensorflow implementation):

Also check out Felix Lau's summary of the paper: https://medium.com/@phelixlau/notes-on-deformable-convolutional-networks-baaabbc11cf3

pytorch-deform-conv's People

Contributors

  • oeway


pytorch-deform-conv's Issues

GPU mode is not available!

Hello oeway

Firstly, thanks for sharing your pytorch version code of deformable convolution.
When I tried to port your code into another codebase that uses CUDA, it gives the error below.

'Type torch.cuda.FloatTensor doesn't implement stateless method range'

However, when I change it to CPU mode, there is no error.

It seems that 'torch.range' does not work in GPU mode. What should the solution be?

Thanks in advance
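I can't see the exact line without the traceback, but the usual fix for this error in older PyTorch versions is to replace the deprecated `torch.range` with `torch.arange` and create the tensor on the input's device. A minimal sketch (the helper name `make_index_range` is mine, not from the repo):

```python
import torch

def make_index_range(n, like):
    # torch.range is deprecated (and its stateless CUDA form was missing in
    # old releases); torch.arange works on both CPU and GPU. arange excludes
    # the endpoint, so arange(0, n) replaces range(0, n - 1).
    return torch.arange(0, n, dtype=like.dtype, device=like.device)

x = torch.zeros(4)                  # stand-in for a feature-map tensor
idx = make_index_range(5, x)
print(idx.tolist())                 # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Because the result inherits `device` from the input tensor, the same code path runs unchanged in CPU and CUDA mode.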

Why only fine tuning on deformable convnet

Hi, why train first on the plain convnet and only afterwards fine-tune the deformable convnet?
Why not train the deformable network from the start (with the offsets frozen initially, as written in the paper)?

about the implementation.. view instead of permute ?

Hello. Thanks for sharing the code.
I have a question about the implementation of offset, in [https://github.com/oeway/pytorch-deform-conv/blob/master/torch_deform_conv/deform_conv.py#L182]
the code :

offsets = offsets.view(batch_size, -1, 2)

the input tensor offsets is b * (2c) * h * w after the first normal conv. I think the offsets of deform-conv correspond to the output channels, so shouldn't the code be:

offsets = offsets.view(b, 2*c, h, w)
offsets = offsets.permute(0, 2, 3, 1)
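The two snippets are indeed not equivalent. A tiny numpy sketch of the difference (numpy's `reshape`/`transpose` have the same semantics as torch's `view`/`permute` here):

```python
import numpy as np

b, c, h, w = 1, 2, 1, 2                 # tiny example: 2 channels, 1x2 map
# offsets laid out channel-first, as a conv produces them: (b, 2c, h, w)
offsets = np.arange(b * 2 * c * h * w).reshape(b, 2 * c, h, w)

# view(batch, -1, 2): pairs consecutive elements of the flattened
# (channel, h, w) block -- i.e. two SPATIAL positions of the same channel.
paired_by_view = offsets.reshape(b, -1, 2)

# permute then view: moves channels last, so each spatial location
# contributes a (2c,) vector that splits into c proper (x, y) pairs.
paired_by_permute = offsets.transpose(0, 2, 3, 1).reshape(b, -1, 2)

print(paired_by_view[0, 0])      # [0 1] -- both values come from channel 0
print(paired_by_permute[0, 0])   # [0 2] -- channel 0 and channel 1 at the
                                 #          same spatial location
```

So whether `view(batch_size, -1, 2)` is a bug depends on how the surrounding code orders the offset channels; the permute variant is the one that groups a (x, y) pair per location.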

related to TF_version #4

The issue proposed in issue 4.
I tested the situation where both networks are trained on the original dataset (not deformed MNIST). Test accuracy:

---original net---
test_data:             99.03%
test_scaled_data: 62.22%
---deformable net---
test_data:              98.67%
test_scaled_data: 54.43%

I think the training phase should use undeformed data; this would show the deformable CNN's advantage.

Offset output channel

super(ConvOffset2D, self).__init__(self.filters, self.filters*2, 3, padding=1, bias=False, **kwargs)
Your implementation is not the same as the original paper describes in section 2.1. The offset output channel dimension should be 2 x N, where N is the number of kernel sampling locations (k*k in the 2D case). That is, the offsets are shared across the channel dimension. You could refer to the implementation at https://github.com/ChunhuanLin/deform_conv_pytorch

Gif image visualization

Hi,

is there a visualization of the method, as seen in the provided gif file? How would one ideally implement this feature?


The deform-conv layers reduce the detection accuracy

I trained your deform-conv on the original data. It gets 70% on the original test set and 89% on scaled data, while the normal CNN model trained on the original data gets 99% on the original test set and 64% on scaled data. The deform-conv may not work.
So I tried deleting two of the deform-conv layers, keeping one. That gets 93% on the original test set and 97% on scaled data.
Your deform-conv may not work; I think you should check it.

Input image w!=h

Hi~ Does anyone know what to do if the input image has w != h, e.g. h=256, w=128? Where should I change the code?
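I haven't traced every place the repo assumes a square input, but the grid generation is the usual culprit. A hedged numpy sketch of a sampling grid built with separate height and width ranges (the helper name is mine):

```python
import numpy as np

def make_grid(h, w):
    # Use a separate coordinate range per axis instead of one shared size;
    # this is the change needed wherever the code assumes h == w.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack([ys, xs], axis=-1)     # shape (h, w, 2)

grid = make_grid(256, 128)
print(grid.shape)        # (256, 128, 2)
```

Each `grid[y, x]` holds the (row, column) coordinate of that pixel, so offsets can be added per-axis without the two dimensions being conflated.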

Weighting in deformed kernel

I am a little bit confused how the weighting is done of each input element of the deformed kernel in your implementation.

In your implementation it looks like you are first calculating the offsets and with these rearrange the input in the ConvOffset2D module (so ConvOffset2D just outputs the rearranged features according to the learned offsets). After that, on the rearranged feature maps you apply a normal Conv2D, which should then do the actual weighting for the deformed kernel, right?

Is this the same as in the original paper, where I understand the weighting to be applied directly to the deformed input elements (kernel elements with offsets), i.e.

y(p_0) = sum_{p_n in R} w(p_n) * x(p_0 + p_n + Δp_n)

I don't see whether this is the same, or whether it makes a different assumption about deformable convolutions than the original paper.
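For what it's worth, the two formulations can be written side by side (my reading, not verified against every line of the code):

```latex
% Paper: weights applied directly at the offset sampling points
y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n)\, x(p_0 + p_n + \Delta p_n)

% This repo: resample first, then apply a regular convolution
\tilde{x}(p) = x(p + \Delta p), \qquad
y(p_0) = \sum_{p_n \in \mathcal{R}} w(p_n)\, \tilde{x}(p_0 + p_n)
```

Substituting the definition of \(\tilde{x}\) into the second sum recovers the first, so the weighting step itself is the same; the remaining difference is how \(\Delta p\) is indexed (per kernel location in the paper, per channel in this repo, as other issues point out).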

Confusion about the shape of offset

Hello,

I am confused about the shape of offset.
The paper mentions:
"The grid R defines the receptive field size and dilation. For example,R = {(−1,−1),(−1,0),...,(0,1),(1,1)}.
In deformable convolution, the regular grid R is aug- mented with offsets {∆pn |n = 1, ..., N }, where N = |R|.
The output offset fields have the same spatial resolution with the input feature map. The channel dimension 2N corresponds to N 2D offsets."
So, I think the shape of the offset field should be [2*9, H, W] = [18, H, W] if a 3x3 kernel is used.
But in your implementation, the shape of the offsets seems to be [batch_size, 2*n_channels, H, W]?


Offset BUG

The paper has offsets with shape [B, 2*9, H, W],
but this repo produces offsets with shape [B, 2*C, H, W].
I think these mean two totally different things.

indexing with a detached variable

Hi oeway,

Any chance you can help me understand your code? On this line, you index the input with a detached variable, so I'm wondering how you propagate the gradient backward through vals_lt, etc. It seems like mapped_vals would not have any parent nodes with gradients? Does that make sense? When I try to do a similar thing here for a spatial transformer network, it gives me a "no nodes require gradients" error.

Do you get around this by freezing the entire network? I feel like you would get the same error if the network wasn't frozen. Any insight you can provide into this would be appreciated.

EDIT: Ok, I get that the gradient propagates through the coords_offset_lt value... can you describe where you got this interpolation algorithm from? Thanks :)
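To illustrate the point in the EDIT with a minimal numpy sketch of standard bilinear interpolation (not the repo's exact code): the integer corner indices are piecewise constant in the coordinates, so detaching them loses nothing; the gradient path is entirely through the interpolation weights, which are linear in the (offset) coordinates.

```python
import numpy as np

def bilinear(img, y, x):
    # Corner indices: integer floor/ceil, piecewise constant in (y, x) --
    # no gradient flows through them, which is what detaching reflects.
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    # Interpolation weights: linear in (y, x), so d(value)/d(offset) is
    # well defined -- the gradient reaches the offsets through these terms.
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

img = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear(img, 1.5, 0.5))   # 6.5: midpoint of rows 1-2, cols 0-1
```

In autograd terms: as long as the weights `wy`, `wx` are computed from the non-detached offset tensor, `mapped_vals` keeps a gradient path to the offsets even though the gather indices themselves are detached.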

Question about implementation

In the file deform_conv.py, what does this line do?
inds = indices[:, 0]*input.size(1)*input.size(2) + indices[:, 1]*input.size(2) + indices[:, 2]
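If I read it correctly, the line converts (d0, d1, d2) index triples into flat indices for a row-major-flattened 3-D tensor, so a 1-D gather/take can be used. A numpy sketch of the same arithmetic:

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # stand-in for `input`
indices = np.array([[0, 1, 2], [1, 2, 3]])  # (d0, d1, d2) triples

# Same arithmetic as the line in question: row-major flattening of a
# 3-D index, so x.ravel()[inds] == x[d0, d1, d2].
inds = (indices[:, 0] * x.shape[1] * x.shape[2]
        + indices[:, 1] * x.shape[2] + indices[:, 2])

print(inds)                                       # [ 6 23]
print(x.ravel()[inds])                            # [ 6 23]
print(np.ravel_multi_index(indices.T, x.shape))   # [ 6 23] -- equivalent
```

It is the hand-rolled equivalent of `np.ravel_multi_index`, needed because the torch version used here could only take elements from a flat tensor.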

A re-trained normal CNN on scaled data beats deform-conv.

First, I appreciate your work, which is easy to use and read.
In scaled_mnist.py, the normal CNN model is trained on the original data and then tested on the scaled data. It shows a bad accuracy of 60% (in my run). Then you fine-tune a deform-conv on the scaled data, and its accuracy is much better. However, I tried re-training the trained CNN model on the scaled data, and the result thoroughly confuses me: it gets 96% on the original test set and 98% on scaled data.
So this experiment cannot prove the effectiveness of the deform-conv layers.

What is the size of the offset?

Hi,
I have read the paper "Deformable Convolutional Networks" and your pytorch-deform-conv code.
I think the size of the offsets may be b * c * kernel[0] * kernel[1] * h * w. Maybe I understand it wrong. Can you explain it?
