eric-mingjie / network-slimming

Network Slimming (Pytorch) (ICCV 2017)

License: MIT License

Python 100.00%

deep-learning convolutional-neural-networks pytorch channel-pruning sparsity

network-slimming's Issues

Question about FLOPs calculation

https://arxiv.org/pdf/1708.06519.pdf
How did you calculate the FLOPs in Table 1?

I used https://github.com/Lyken17/pytorch-OpCounter to count the FLOPs of the baseline VGGNet (VGG-19) and got 399718400.0.

However, the Network Slimming paper reports 7.97×10^8, almost double that figure.

When I apply the same FLOPs counter to VGG-16, the result matches the layer-wise FLOPs from
the L1 pruning ConvNet paper, https://arxiv.org/abs/1608.08710
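
One possible explanation for the factor of two, offered as an assumption rather than a confirmed answer: some counters report multiply-accumulate operations (MACs), while papers often count the multiply and the add separately, so FLOPs ≈ 2 × MACs. The sketch below counts only the convolution MACs of a VGG-19-style network on 32×32 inputs. The channel layout `VGG19_CFG` and the helper `conv_macs` are illustrative assumptions, not taken from the repository.

```python
# Assumed VGG-19 style layout for CIFAR (32x32 input); 'M' is a 2x2 max-pool.
VGG19_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
             512, 512, 512, 512, 'M', 512, 512, 512, 512]

def conv_macs(cfg, in_channels=3, size=32, kernel=3):
    """Count multiply-accumulates of the 3x3 'same' convolutions only."""
    total = 0
    for v in cfg:
        if v == 'M':
            size //= 2          # pooling halves the spatial resolution
            continue
        total += in_channels * v * kernel * kernel * size * size
        in_channels = v
    return total

macs = conv_macs(VGG19_CFG)
flops = 2 * macs                # one multiply plus one add per MAC
print(macs, flops)
```

Under these assumptions the convolutions account for 398,131,200 MACs, close to the 399,718,400 reported by the counter, and doubling gives about 7.96×10^8, close to the figure in the paper.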

Where is the model stored after "Train with Sparsity" or "Baseline"?

After training, is a checkpoint file saved? If so, please tell me which directory it is in.

Question about resprune

Thanks for sharing!
I trained a PreResNet-164 with python main.py -sr --s 0.00001 --dataset cifar10 --arch resnet --depth 164, and I want to prune the model with python resprune.py --dataset cifar10 --depth 164 --percent 0.4 --model [My model path] --save [My path]. However, it fails with the error below and I don't know how to fix it.
`Test set: Accuracy: 9262/10000 (92.6%)

Cfg:
[10, 15, 16, 24, 15, 16, 7, 9, 11, 17, 15, 15, 23, 15, 15, 8, 12, 15, 19, 15, 16, 20, 15, 15, 13, 12, 16, 21, 15, 16, 14, 15, 15, 18, 13, 16, 17, 12, 15, 13, 13, 15, 1, 4, 7, 9, 12, 16, 18, 12, 15, 25, 15, 16, 35, 32, 32, 38, 31, 32, 47, 32, 32, 44, 31, 32, 51, 30, 32, 43, 31, 32, 38, 31, 32, 43, 32, 32, 54, 32, 32, 70, 32, 32, 51, 32, 32, 52, 32, 32, 51, 31, 32, 52, 32, 32, 52, 30, 32, 57, 32, 32, 51, 30, 32, 61, 31, 32, 125, 64, 64, 59, 60, 64, 68, 60, 64, 66, 63, 64, 78, 62, 64, 89, 64, 64, 106, 63, 64, 119, 64, 64, 123, 64, 64, 149, 64, 64, 135, 64, 64, 144, 64, 63, 127, 64, 63, 139, 64, 63, 144, 64, 64, 135, 64, 64, 141, 63, 64, 141, 64, 63, 85]
Traceback (most recent call last):
  File "resprune.py", line 181, in <module>
    m1.weight.data = m0.weight.data.clone()
  File "/home/ubuntu/anaconda3/envs/YOLACT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 585, in __getattr__
    type(self).__name__, name))
AttributeError: 'Sequential' object has no attribute 'weight'`

Can someone give me some suggestions?
Thanks very much!

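Question about the loss function for sparsity training in the paper

Hello. I have a question. I think the loss function given in the paper "Learning Efficient Convolutional Networks Through Network Slimming" applies to the BN layers that are to be pruned, while the loss at the final layer of the network is still the classic YOLOv3 loss. Is that the right way to understand it? According to the code, the final loss is still the classic YOLOv3 loss value, with no L1 regularization term added. If the loss at the end of the network were the formula from the paper, then the backward gradient of every layer should include the gradient of the L1 term.

I look forward to your reply. Thank you very much.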

Problem encountered with the pruning script vggprune.py

Traceback (most recent call last):
  File "vggprune.py", line 73, in <module>
    mask = weight_copy.gt(thre).float().cuda()
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'other'
My environment is Python 3.6 with torch 0.4.1.

About sparsity regularization

Thank you for sharing. I have a question about sparsity regularization.
As shown in formula (1) of the paper, g(s) is added to the loss function. But in the code, I cannot find g(s) in the loss function. I only find that an additional gradient on the scaling factors is added to the original gradient.
Could you show me where g(s) is added to the loss function in the code?
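
For g(s) = |s|, adding the term to the loss and adding its gradient afterwards lead to the same update, because the (sub)gradient of λ|γ| is λ·sign(γ). The pure-Python sketch below checks that equivalence numerically. `base_loss`, `lam`, and the helper names are hypothetical, and the quadratic base loss is only a stand-in for the network's real loss.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def base_loss(gamma):
    # Stand-in for the task loss as a function of one BN scaling factor.
    return (gamma - 0.5) ** 2

def base_grad(gamma):
    # Analytic gradient of base_loss.
    return 2.0 * (gamma - 0.5)

def grad_with_l1(gamma, lam):
    """Gradient of base_loss with the L1 subgradient added afterwards."""
    return base_grad(gamma) + lam * sign(gamma)

def numeric_grad(gamma, lam, eps=1e-6):
    """Central difference of the explicit objective base_loss + lam*|gamma|."""
    f = lambda g: base_loss(g) + lam * abs(g)
    return (f(gamma + eps) - f(gamma - eps)) / (2 * eps)

lam = 1e-4
for gamma in (0.8, -0.3):
    print(gamma, grad_with_l1(gamma, lam), numeric_grad(gamma, lam))
```

The two printed gradients agree away from γ = 0, which suggests the loss value simply does not need to include the penalty for the parameters to be driven toward sparsity.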

channel selection layer

What is the purpose of the channel selection layer for ResNet?

python vggprune.py

When I run vggprune.py, I get an error: 'unexpected key "module.feature.0.weight" in state_dict'.
I think the parameter names in the model are wrong. What should I do?
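
A key such as `module.feature.0.weight` typically appears when a checkpoint was saved from a model wrapped in `torch.nn.DataParallel`, which prefixes every parameter name with `module.`. Assuming that is the cause here, one workaround is to strip the prefix before loading. The helper `strip_module_prefix` below is hypothetical and works on any plain mapping from names to values.

```python
from collections import OrderedDict

def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from keys."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned

# Placeholder strings stand in for the tensors of a real checkpoint.
saved = OrderedDict([("module.feature.0.weight", "w0"),
                     ("module.feature.1.bias", "b1")])
print(list(strip_module_prefix(saved)))
```

The cleaned mapping can then be passed to `model.load_state_dict(...)` on the unwrapped model.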

gamma sparsity question

Hello, after training VGG-16 I found that all the gamma values lie between 0.1 and 0.8, so they do not seem to be sparse.

How do I measure inference time?

    start_time = time.time()
    output = model(data)
    end_time = time.time()
    total_time += (end_time - start_time)
Measured this way, inference after pruning is about as fast as without pruning, with no noticeable speedup. Why?
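
Several things can hide a speedup in a measurement like the one above: CUDA kernels run asynchronously, so `time.time()` may stop before the GPU has finished; the first iterations include warm-up costs; and a single batch is noisy. The sketch below is a generic timing helper written under those assumptions. `time_inference`, `warmup`, `iters`, and `sync` are hypothetical names; on a GPU one would pass `torch.cuda.synchronize` as `sync`.

```python
import time

def time_inference(fn, warmup=5, iters=20, sync=lambda: None):
    """Average wall-clock seconds per call of fn(), after warm-up runs."""
    for _ in range(warmup):
        fn()
    sync()                          # make sure pending work has finished
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    sync()                          # wait again before stopping the clock
    return (time.perf_counter() - start) / iters

# Dummy workload standing in for `model(data)`.
avg = time_inference(lambda: sum(i * i for i in range(10000)))
print("avg seconds per call:", avg)
```

Even with careful timing, a pruned network may not run much faster if the remaining layers are too small to keep the device busy, so FLOPs reductions do not always translate into wall-clock gains.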

code error: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

code error:

invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

Fix:
In main.py, replace .data[0] with .item() in the train() and test() functions.

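A small question about sparsity training

The paper applies L1 sparsity regularization to the scaling factors in BN. Looking at the code, this seems to be done differently from other L1 regularization approaches; setting weight_decay in torch applies L2 regularization to all parameters by default.

How is the L1 regularization realized here?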

Hello, may I ask why the absolute value of the BN weights is taken during pruning?
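
On why the absolute value matters: a BN scaling factor can be negative, and a large negative γ scales its channel just as strongly as a large positive one, so ranking by the signed value would wrongly discard it. A minimal sketch of a global magnitude threshold follows. `global_threshold`, `keep_masks`, and the sample values are hypothetical, and the exact threshold rule is an assumption modeled on the `--percent` option that appears in other issues here.

```python
def global_threshold(gammas_per_layer, percent):
    """Magnitude at the given fraction of all |gamma| values, over all layers."""
    magnitudes = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    index = int(len(magnitudes) * percent)
    return magnitudes[index]

def keep_masks(gammas_per_layer, threshold):
    """1 keeps a channel, 0 prunes it; compares |gamma| to the threshold."""
    return [[int(abs(g) > threshold) for g in layer]
            for layer in gammas_per_layer]

gammas = [[0.9, -0.8, 0.01], [-0.02, 0.5, 0.03]]
thre = global_threshold(gammas, percent=0.4)
print(thre, keep_masks(gammas, thre))
```

Here the channel with γ = -0.8 is kept because its magnitude is large, whereas a comparison on the signed value would have pruned it.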

vgg_prune.py

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 123, in <module>
    newmodel = vgg(dataset=args.dataset, cfg=cfg)
  File "/data/hzm/network-slimming-master/models/vgg.py", line 22, in __init__
    self.feature = self.make_layers(cfg, True)
  File "/data/hzm/network-slimming-master/models/vgg.py", line 39, in make_layers
    conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 315, in __init__
    False, _pair(0), groups, bias)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 43, in __init__
    self.reset_parameters()
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 47, in reset_parameters
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 288, in kaiming_uniform_
    fan = _calculate_correct_fan(tensor, mode)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 257, in _calculate_correct_fan
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 191, in _calculate_fan_in_and_fan_out
    receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0

Question about resnet50

Hi Eric,
Since I need to train a ResNet50 with fewer parameters, I read the ResNet definition in your code. I found that it is quite different from the official PyTorch torchvision model. Could you please tell me how to change the code so that I can train a pruned ResNet50?

Why is testing the ResNet-164 model slow?

Why is testing the ResNet-164 model slow? Its FLOPs and parameter count are smaller than those of ResNet-18 (the official PyTorch model, used for CIFAR-10), but it is slower...

Is it because of channel_selection?

Can the algorithm prune MobileNet v1 or v2?

Thanks!

RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

@Eric-mingjie: First of all, thank you very much. When I try to prune my architecture (SimpNet), it crashes halfway through pruning with this error:
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

What is wrong here? Could you kindly help me resolve this issue?
Here is the whole log:

=> loading checkpoint 'model_best_simpnet8.pth.tar'
=> loaded checkpoint 'model_best_simpnet8.pth.tar' (epoch 101) Prec1: 96.120000
simpnet8m(
  (features): Sequential(
    (0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.02)
    (4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (6): ReLU(inplace)
    (7): Dropout2d(p=0.05)
    (8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (10): ReLU(inplace)
    (11): Dropout2d(p=0.05)
    (12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (14): ReLU(inplace)
    (15): Dropout2d(p=0.05)
    (16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (18): ReLU(inplace)
    (19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (20): Dropout2d(p=0.05)
    (21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (23): ReLU(inplace)
    (24): Dropout2d(p=0.05)
    (25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (27): ReLU(inplace)
    (28): Dropout2d(p=0.05)
    (29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (31): ReLU(inplace)
    (32): Dropout2d(p=0.05)
    (33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (35): ReLU(inplace)
    (36): Dropout2d(p=0.05)
    (37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (39): ReLU(inplace)
    (40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (41): Dropout2d(p=0.1)
    (42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (44): ReLU(inplace)
    (45): Dropout2d(p=0.1)
    (46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
    (48): ReLU(inplace)
    (49): Dropout2d(p=0.1)
    (50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
    (52): ReLU(inplace)
  )
  (classifier): Linear(in_features=600, out_features=10, bias=True)
)
layer index: 3 	 total channel: 128 	 remaining channel: 9
layer index: 7 	 total channel: 182 	 remaining channel: 17
layer index: 11 	 total channel: 182 	 remaining channel: 44
layer index: 15 	 total channel: 182 	 remaining channel: 34
layer index: 19 	 total channel: 182 	 remaining channel: 89
layer index: 24 	 total channel: 182 	 remaining channel: 128
layer index: 28 	 total channel: 182 	 remaining channel: 102
layer index: 32 	 total channel: 182 	 remaining channel: 86
layer index: 36 	 total channel: 182 	 remaining channel: 95
layer index: 40 	 total channel: 430 	 remaining channel: 9
layer index: 45 	 total channel: 430 	 remaining channel: 89
layer index: 49 	 total channel: 455 	 remaining channel: 8
layer index: 53 	 total channel: 600 	 remaining channel: 339
Pre-processing Successful!
Files already downloaded and verified

Test set: Accuracy: 1000/10000 (10.0%)

[9, 17, 44, 34, 89, 'M', 128, 102, 86, 95, 9, 'M', 89, 8, 339]
In shape: 3, Out shape 9.
In shape: 9, Out shape 17.
In shape: 17, Out shape 44.
In shape: 44, Out shape 34.
In shape: 34, Out shape 89.
In shape: 89, Out shape 128.
In shape: 128, Out shape 102.
In shape: 102, Out shape 86.
In shape: 86, Out shape 95.
In shape: 95, Out shape 9.
In shape: 9, Out shape 89.
In shape: 89, Out shape 8.
In shape: 8, Out shape 339.
simpnet8m(
  (features): Sequential(
    (0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
    (2): ReLU(inplace)
    (3): Dropout2d(p=0.02)
    (4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (6): ReLU(inplace)
    (7): Dropout2d(p=0.05)
    (8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (10): ReLU(inplace)
    (11): Dropout2d(p=0.05)
    (12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (14): ReLU(inplace)
    (15): Dropout2d(p=0.05)
    (16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (18): ReLU(inplace)
    (19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (20): Dropout2d(p=0.05)
    (21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (23): ReLU(inplace)
    (24): Dropout2d(p=0.05)
    (25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (27): ReLU(inplace)
    (28): Dropout2d(p=0.05)
    (29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (31): ReLU(inplace)
    (32): Dropout2d(p=0.05)
    (33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
    (35): ReLU(inplace)
    (36): Dropout2d(p=0.05)
    (37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (39): ReLU(inplace)
    (40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
    (41): Dropout2d(p=0.1)
    (42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
    (44): ReLU(inplace)
    (45): Dropout2d(p=0.1)
    (46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
    (48): ReLU(inplace)
    (49): Dropout2d(p=0.1)
    (50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
    (51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
    (52): ReLU(inplace)
  )
  (classifier): Linear(in_features=600, out_features=10, bias=True)
)
Files already downloaded and verified
Traceback (most recent call last):
  File "vggprune.py", line 213, in <module>
    test(model)
  File "vggprune.py", line 150, in test
    output = model(data)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/tmpstore/Network_slimming/network-slimming/models/simpnet8m.py", line 49, in forward
    out = self.features(x)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 282, in forward
    self.padding, self.dilation, self.groups)
  File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
    return f(input, weight, bias)
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead

Thanks a lot

Question about zero channels after pruning for VGG

Hello, for VGG, how do you handle the case where a layer has 0 channels left after pruning? Thanks!
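
The `IndexError` and `ZeroDivisionError` tracebacks in the neighboring issues are consistent with a layer ending up with zero channels, since a convolution cannot be initialized with 0 output channels. One possible safeguard, not something the source confirms the repository does, is to always keep at least the strongest channel in each layer. `mask_with_floor` is a hypothetical helper.

```python
def mask_with_floor(gammas, threshold):
    """Keep channels with |gamma| > threshold, but never fewer than one."""
    mask = [int(abs(g) > threshold) for g in gammas]
    if sum(mask) == 0:
        # Every channel fell below the threshold: keep the largest one.
        strongest = max(range(len(gammas)), key=lambda i: abs(gammas[i]))
        mask[strongest] = 1
    return mask

print(mask_with_floor([0.01, -0.03, 0.02], threshold=0.1))
print(mask_with_floor([0.5, 0.01], threshold=0.1))
```

Lowering the pruning percentage so that no layer is emptied is a simpler alternative when the architecture allows it.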

densenet: growthRate & n

network-slimming/models/densenet.py

Line 72 in 0b2f743

cfg.append([start+12*i for i in range(n+1)])

Should the 12 here be replaced with growthRate, and the one in line 73 with n?
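
To illustrate the question, the sketch below compares a hard-coded and a parameterized version of the default configuration. Only line 72 is quoted in the issue, so the starting value `growth_rate * 2`, the form of the update standing in for line 73, and the function name `default_cfg` are all assumptions made for illustration.

```python
def default_cfg(growth_rate, n, hardcoded=False):
    """Per-layer input widths: n dense layers plus one transition per block,
    for three blocks."""
    step = 12 if hardcoded else growth_rate   # per-layer growth (line 72)
    layers = 12 if hardcoded else n           # layers per block (assumed line 73)
    cfg = []
    start = growth_rate * 2
    for _ in range(3):
        cfg.extend(start + step * i for i in range(n + 1))
        start += growth_rate * layers
    return cfg

print(default_cfg(12, 12)[:4])
print(default_cfg(12, 12) == default_cfg(12, 12, hardcoded=True))
print(default_cfg(24, 16) == default_cfg(24, 16, hardcoded=True))
```

Under these assumptions the two versions coincide for growth rate 12 with 12 layers per block and diverge for other settings, which is consistent with the concern raised in the issue.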

ZeroDivisionError: float division by zero

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 128, in <module>
    newmodel = vgg(cfg=cfg)
  File "/home/ustc/akb/network-slimming/models/vgg.py", line 22, in __init__
    self.feature = self.make_layers(cfg, True)
  File "/home/ustc/akb/network-slimming/models/vgg.py", line 39, in make_layers
    conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 297, in __init__
    False, _pair(0), groups, bias)
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 38, in __init__
    self.reset_parameters()
  File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 44, in reset_parameters
    stdv = 1. / math.sqrt(n)
ZeroDivisionError: float division by zero

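How to perform iterative pruning

Does the iterative pruning mentioned in the paper mean running your provided code once and then repeating the train-with-sparsity, prune, and fine-tune steps on the resulting fine-tuned model? When I actually tried this, I found that pruning the already-pruned model again removes the same channels as the first round and does not prune any additional channels. What is the reason for this?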

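Why can't the .tar model archive generated by pruning be opened? It cannot be extracted and is reported as corrupted

As the title says. Has anyone else run into this?
The model .tar archive generated after pruning cannot be opened and is reported as corrupted.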

invalid syntax

main.py
line 150
def test():
SyntaxError:Invalid Syntax

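A question about the channel selection layer

Hello, may I ask why we need the channel selection layer to assist the pruning of ResNet and DenseNet?

From reading the code, my understanding is that for ResNet and DenseNet a channel selection layer is added after the BN layer, and the network is then trained. When the model is pruned, the values of the channel selection layer are all set to 0, and then the entries to be kept are set to 1.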

    # We need to set the channel selection layer.
    m2 = new_modules[layer_id + 1]
    m2.indexes.data.zero_()
    m2.indexes.data[idx1.tolist()] = 1.0

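It seems that this also just prunes the entries of the added channel selection layer that need pruning and keeps the rest. But I feel that if I did not add this channel selection layer and pruned in the same way as for VGG, there would be no problem either.

Perhaps I have not understood the code thoroughly enough. I hope you can give me some pointers. Thank you, and I look forward to your reply!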
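
A possible reading, not confirmed by the author: in a residual network the first BN of a block reads from the skip path, which is shared with every other block, so the layer that produces those channels cannot simply be narrowed the way a VGG convolution can. A selection step after the BN lets one block consume a subset of the shared channels while the skip path keeps its full width. The sketch below uses plain lists as stand-ins for channels; `channel_selection` and `branch` are hypothetical helpers.

```python
def channel_selection(channels, indexes):
    """Keep only the channels whose entry in `indexes` is nonzero."""
    return [c for c, keep in zip(channels, indexes) if keep]

def branch(selected):
    # Stand-in for the block's convolutions; maps back to the full width of 4.
    return [sum(selected)] * 4

x = [1.0, 2.0, 3.0, 4.0]             # 4 channels on the shared skip path
indexes = [1, 0, 0, 1]               # channels 1 and 2 pruned for this block
selected = channel_selection(x, indexes)
out = [a + b for a, b in zip(x, branch(selected))]   # residual addition
print(selected, out)
```

The branch sees only two input channels, yet the residual addition still operates on all four, which would not be possible if the shared tensor itself had been narrowed.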

Could you provide pruned models?

Hello, could you provide pretrained or pruned models of VGG-19, ResNet-164, and DenseNet-40 on CIFAR-10 and CIFAR-100? Thanks!

Training with Sparsity

Can this process start by fine-tuning from a "normally trained" model?

I am pruning InceptionV3 by fine-tuning from a pre-trained ImageNet model, and I find that it becomes sparse only slowly (base_lr=0.01, 30 epochs). I wonder whether it is possible to use fine-tuning here.

Also, what do you think about InceptionV3, which has many branches? I find it troublesome to implement InceptionV3prune.py.

What if gamma is near zero but beta is very large?

Only L1 regularization of gamma is considered, but what about beta? If gamma is near zero but beta is large, should we still prune that channel?

How can I fine-tune pruned models?

Hi @Eric-mingjie,

Thanks for your great implementation. I want to fine-tune my pruned network, but when I run your main.py the error says that there is no --refine flag. Could you show me how to use it?

Thanks,
Hai

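DenseNet network structure

I would like to ask: in the DenseNet code, why are the output feature maps of each dense layer in a dense block not used as input to the other dense layers in the same block?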

def _make_denseblock(self, block, blocks, cfg):
        layers = []
        assert blocks == len(cfg), 'Length of the cfg parameter is not right.'
        for i in range(blocks):
            # Currently we fix the expansion ratio as the default value
            layers.append(block(self.inplanes, cfg=cfg[i], growthRate=self.growthRate, dropRate=self.dropRate))
            self.inplanes += self.growthRate

        return nn.Sequential(*layers)

This is the dense block in torchvision:

class _DenseBlock(nn.ModuleDict):
    _version = 2

    def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate, memory_efficient=False):
        super(_DenseBlock, self).__init__()
        for i in range(num_layers):
            layer = _DenseLayer(
                num_input_features + i * growth_rate,
                growth_rate=growth_rate,
                bn_size=bn_size,
                drop_rate=drop_rate,
                memory_efficient=memory_efficient,
            )
            self.add_module('denselayer%d' % (i + 1), layer)

    def forward(self, init_features):
        features = [init_features]
        for name, layer in self.items():
            new_features = layer(features)
            features.append(new_features)
        return torch.cat(features, 1)
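
The two constructions can be equivalent if each layer in the `nn.Sequential` concatenates its input with its new features before returning, which would also explain why `self.inplanes` grows by `growthRate` per layer in the first snippet. That the repository's block does this is an assumption here, since the issue does not show its `forward`. A pure-Python sketch, with lists standing in for the channel dimension and hypothetical helper names:

```python
def new_features(inputs, layer_id):
    # Stand-in for BN-ReLU-Conv: one new "channel" derived from all inputs.
    return [sum(inputs) + layer_id]

def sequential_style(x, num_layers):
    """Each layer returns concat(input, new), so layers can simply be chained."""
    for i in range(num_layers):
        x = x + new_features(x, i)
    return x

def feature_list_style(x, num_layers):
    """torchvision style: keep a list of features, concatenate for each layer."""
    features = [x]
    for i in range(num_layers):
        concatenated = [c for f in features for c in f]
        features.append(new_features(concatenated, i))
    return [c for f in features for c in f]

x0 = [1.0, 2.0]
print(sequential_style(x0, 3) == feature_list_style(x0, 3))
```

In both styles every layer sees all earlier feature maps; the difference is only where the concatenation is written.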

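Training results differ considerably

Hello, author:
I used your code to train VGG-16, VGG-19, ResNet-56, and ResNet-164 on the CIFAR-10 and CIFAR-100 datasets. However, both the baseline results and the train-with-sparsity results show a large gap from the results you report. The specific results are below. Could you tell me what might cause this?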

mask-impl fine-tune

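Hello, the mask version, prune_mask, does not actually prune the weights. During fine-tuning, although the masked weights are used, all parameters are updated. Doesn't that mean the BN weights that were set to 0 get trained as well?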
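
One common way to keep masked BN parameters at zero during fine-tuning is to zero their gradients before each optimizer step, so that only the surviving channels are updated. Whether the mask implementation does this is not stated in the issue, so the sketch below is an assumption; `masked_sgd_step` is a hypothetical helper operating on plain lists.

```python
def masked_sgd_step(weights, grads, lr):
    """Plain SGD step that leaves exactly-zero (pruned) weights untouched."""
    mask = [float(w != 0.0) for w in weights]
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]

bn_weight = [0.7, 0.0, -0.4, 0.0]      # zeros mark pruned channels
grad = [0.1, 0.5, -0.2, 0.3]
updated = masked_sgd_step(bn_weight, grad, lr=0.1)
print(updated)
```

The pruned entries stay at zero even though their raw gradients are nonzero, while the other entries move as usual.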

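How to count the parameters of the pruned network

Hello!
I would like to ask what method you use to count the number of parameters of the pruned network. As I understand it, the code actually just sets some unimportant weights to zero?
Thanks!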
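
If channels are only masked (weights set to zero), the tensor shapes and therefore the parameter count do not change; the count drops only when a smaller model is rebuilt from the pruned channel configuration. A rough sketch of counting parameters from a VGG-style configuration follows. `count_params` and both example configurations are hypothetical; the helper counts 3×3 conv weights without bias, the BN weight and bias, and one final linear layer.

```python
def count_params(cfg, in_channels=3, num_classes=10, kernel=3):
    """Parameters of conv (no bias) + BN (weight, bias) layers + classifier."""
    total = 0
    for v in cfg:
        if v == 'M':                      # pooling has no parameters
            continue
        total += in_channels * v * kernel * kernel   # conv weights
        total += 2 * v                               # BN gamma and beta
        in_channels = v
    total += in_channels * num_classes + num_classes  # linear weight + bias
    return total

full = [64, 'M', 128]
pruned = [30, 'M', 50]
print(count_params(full), count_params(pruned))
```

For an actual PyTorch model, `sum(p.numel() for p in model.parameters())` gives the corresponding count directly.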

ResNet model

Hello, is a pruned ResNet model available? Could you provide it? Thank you very much!

vgg_prune.py

Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.

Traceback (most recent call last):
  File "vggprune.py", line 73, in <module>
    mask = weight_copy.abs().gt(thre).float()
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other'

How does pruning the last conv layer affect the first linear layer of the classifier?

I trained the VGG and saved the model as a .pth file, then loaded it to prune some of its filters.
After pruning, the last conv layer no longer has 512 filters; some are gone.
How does pruning the last conv layer affect the first linear layer of the classifier, which has shape (512 * 7 * 7, 4096)?
How can I prune the input weights of the classifier to match the last conv layer?
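
When the flattened feature map has more than one spatial position per channel, each kept channel corresponds to a contiguous block of inputs to the first linear layer. Assuming the usual channel-major flatten order (as produced by `x.view(x.size(0), -1)` on an N×C×H×W tensor), the kept input indices can be derived as below. `linear_input_indices` is a hypothetical helper.

```python
def linear_input_indices(kept_channels, height, width):
    """Input positions of the first linear layer that survive pruning."""
    spatial = height * width
    return [c * spatial + offset
            for c in kept_channels
            for offset in range(spatial)]

# Example: channels 0 and 2 kept, 2x2 feature map, so 4 inputs per channel.
idx = linear_input_indices([0, 2], height=2, width=2)
print(idx)
```

For the 512 * 7 * 7 case in the question, each kept channel contributes 49 inputs, and the linear weight would then be sliced along its input dimension with these indices. When the feature map is 1×1, the indices reduce to the channel indices themselves.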

Why not stop gradient for channel_selection layer's parameters?

class channel_selection(nn.Module):
    def __init__(self, num_channels):
        super(channel_selection, self).__init__()
        self.indexes = nn.Parameter(torch.ones(num_channels))

Should this be:
self.indexes = nn.Parameter(torch.ones(num_channels), requires_grad=False)?

BN weight not in model.parameters()

network-slimming/main.py

Line 126 in 98e6b4d

def updateBN():

Hello, thank you for sharing the code.

I tried to reimplement your approach but found that the weights of nn.BatchNorm2d are not in model.parameters(), so the optimizer does not update them. Also, the function updateBN() does not work: in "m.weight.grad.data.add_(...)", weight.grad is None.

Could you share how you resolved this, or tell me what I missed? Thanks!

How to set the number of input channels of a convolution layer after channel-level sparsity

Hi! After channel_selection is invoked, the entries in indexes that correspond to the channels to be pruned are set to 0, so the number of channels of the feature map changes. After pruning, how do you determine the number of channels of the feature map?

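The relative position of BN and conv in ResNet

Hello, thank you very much for your open-source code!
I would like to ask: when designing the ResNet bottleneck, what was the reasoning behind placing BN before conv?
When I applied your ResNet pruning approach to a detection model, I used a bottleneck with BN placed after conv, and I am not sure how much the pruning differs from the former.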

Why isn't multi-GPU training implemented?

I tried to modify the code to support multi-GPU training, but could not get it to work.
So I am curious why you didn't implement multi-GPU training.
Thank you in advance!

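About adjusting the structure after pruning

Hello, thank you very much for your work. I have a question: for network slimming, is it feasible to readjust the structure after pruning to make the network more lightweight?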

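A small question about pruning

Hello, thanks for your reproduction work. A small question: in your mask implementation, you mask the parts of the BN layers that fall below the threshold but do not actually cut their connections. Does this mean the final model's weights are not actually reduced? I look forward to your answer.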

Question about the weight in nn.Linear

network-slimming/vggprune.py

Lines 161 to 166 in 98e6b4d

    
    elif isinstance(m0, nn.Linear):
        idx0 = np.squeeze(np.argwhere(np.asarray(start_mask.cpu().numpy())))
        if idx0.size == 1:
            idx0 = np.resize(idx0, (1,))
        m1.weight.data = m0.weight.data[:, idx0].clone()
        m1.bias.data = m0.bias.data.clone()

I think there is something wrong with how the weights of nn.Linear are handled. nn.Linear is applied after flattening the output of the last Conv2d, so I think the index should not be idx0; the shape of m0.weight depends on the output shape of the last Conv2d.

Cannot find L1 loss in training code

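Question about the sparsity value for 100-class classification

May I ask: for 10-class classification, the sparsity coefficients for VGG, ResNet, and DenseNet are 0.0001, 0.00001, and 0.00001 respectively. What are the sparsity coefficients for 100-class classification?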

About resprune.py

    if conv_count % 3 != 1:
        w1 = w1[idx1.tolist(), :, :, :].clone()  # ???

I have a question about the code pasted above. Why is it not the following instead?
    w1 = w1[:, idx1.tolist(), :, :].clone()
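
A `Conv2d` weight in PyTorch has shape [out_channels, in_channels, kH, kW], so indexing the first dimension selects output filters and indexing the second selects input channels. If the surrounding code has already sliced the input channels on an earlier line (an assumption, since the issue quotes only this fragment), then the quoted line prunes the output filters. A sketch with nested lists standing in for a weight, with the kernel dimensions omitted and hypothetical helper names:

```python
def select_out(weight, idx):
    """Equivalent of weight[idx, :, :, :] for a nested-list [out][in] weight."""
    return [weight[o] for o in idx]

def select_in(weight, idx):
    """Equivalent of weight[:, idx, :, :] for a nested-list [out][in] weight."""
    return [[row[i] for i in idx] for row in weight]

# 3 output filters x 2 input channels; entry "oXiY" names its position.
w = [["o0i0", "o0i1"],
     ["o1i0", "o1i1"],
     ["o2i0", "o2i1"]]
print(select_out(w, [0, 2]))   # keeps filters 0 and 2, all inputs
print(select_in(w, [1]))       # keeps input channel 1 for every filter
```

Which of the two is appropriate depends on whether the mask in `idx1` describes the layer's inputs or its outputs.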

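A bit confused about the sparsity operation

May I ask why the L1 gradient operation on the BN scale is applied only in the backward pass, and is not reflected in the loss as in the formula given in the paper?
I can't quite understand this. Could you please clarify? Thanks!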

RuntimeError: CUDA error: device-side assert triggered

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [10,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [12,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [24,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [64,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [76,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [77,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [88,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [89,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [90,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [101,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [102,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [114,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [116,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [36,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [37,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [38,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [49,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [50,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [62,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "resprune.py", line 224, in <module>
    main()
  File "resprune.py", line 203, in main
    w1 = w1[idx1.tolist(), :, :, :].clone()
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [114,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [116,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [36,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [37,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [38,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [49,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [50,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [62,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "resprune.py", line 224, in <module>
    main()
  File "resprune.py", line 203, in main
    w1 = w1[idx1.tolist(), :, :, :].clone()
RuntimeError: CUDA error: device-side assert triggered
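
The assertion text in the log (`index >= -sizes[i] && index < sizes[i]`) says that at least one index passed to the indexing kernel lies outside the dimension it is applied to. Since CUDA reports device-side asserts asynchronously, it can help to check the index list on the host before the failing line runs. Below is a minimal sketch of such a check in plain Python. The helper name `out_of_range` and the sample values are hypothetical and are not part of `resprune.py`.

```python
def out_of_range(indices, size):
    """Return the entries of `indices` that cannot index a dimension of length `size`.

    Uses the same condition as the failed assertion:
    an index is valid when -size <= index < size.
    """
    return [i for i in indices if not (-size <= i < size)]


# Hypothetical values: a dimension of length 16 indexed with a list
# that was computed for a wider layer.
bad = out_of_range([0, 3, 15, 16, 40], size=16)
print(bad)  # [16, 40]
```

Assuming `w1` and `idx1` are as in the traceback, calling something like `out_of_range(idx1.tolist(), w1.shape[0])` just before line 203 would show which indices exceed the first dimension of `w1`. Running the script on the CPU, or with the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually gives a more precise error location as well.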

	elif isinstance(m0, nn.Linear):
		idx0 = np.squeeze(np.argwhere(np.asarray(start_mask.cpu().numpy())))
		if idx0.size == 1:
			idx0 = np.resize(idx0, (1,))
		m1.weight.data = m0.weight.data[:, idx0].clone()
		m1.bias.data = m0.bias.data.clone()
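
The `idx0.size == 1` branch in the snippet above guards against an edge case of `np.squeeze`: when the mask keeps exactly one channel, squeezing the `argwhere` result yields a 0-dimensional array, and indexing with it would drop the column axis. The sketch below reproduces that behaviour with NumPy only. The function name `selected_indices` and the toy `weight` array are assumptions for illustration, standing in for the mask and the `nn.Linear` weight.

```python
import numpy as np


def selected_indices(mask):
    """Turn a 0/1 mask into a 1-D array of kept indices, as in the snippet above."""
    idx = np.squeeze(np.argwhere(np.asarray(mask)))
    if idx.size == 1:
        # squeeze produced a 0-d array; restore a 1-D shape of length 1
        idx = np.resize(idx, (1,))
    return idx


# Toy stand-in for a Linear weight with 3 output and 4 input features.
weight = np.arange(12).reshape(3, 4)

one = selected_indices([0, 0, 1, 0])   # a single kept input channel
many = selected_indices([1, 0, 1, 1])  # several kept input channels

print(weight[:, one].shape)   # (3, 1)
print(weight[:, many].shape)  # (3, 3)
```

Without the resize step, `weight[:, idx]` with a 0-dimensional `idx` would return a 1-D array of shape `(3,)`, which no longer matches the 2-D weight expected by the pruned layer.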
