eric-mingjie / network-slimming
Network Slimming (PyTorch) (ICCV 2017)
License: MIT License
https://arxiv.org/pdf/1708.06519.pdf
How do you calculate the FLOPs in Table 1?
I used https://github.com/Lyken17/pytorch-OpCounter to count the FLOPs of the baseline VGGNet (VGG-19) and got 399718400.0.
However, the Network Slimming paper reports 7.97×10^8, which is almost double that figure.
When I apply the same counter to VGG-16, the result matches the layer-wise FLOPs given in the
L1 pruning ConvNet paper: https://arxiv.org/abs/1608.08710
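One possible explanation for the factor of two, offered as a sketch rather than a confirmed answer: some counters report multiply-accumulate operations (MACs), while papers often count a multiply and an add as two separate FLOPs. The snippet below counts the convolution and final linear MACs of a VGG-19 layout at 32×32 input. The `VGG19_CFG` list, the 3×3 kernels with padding 1, the 10-class linear layer and the helper name `count_macs` are assumptions for illustration, not taken from the issue.

```python
# Assumed VGG-19 layout for CIFAR: numbers are conv output channels, 'M' is 2x2 max pooling.
VGG19_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
             512, 512, 512, 512, 'M', 512, 512, 512, 512]


def count_macs(cfg, in_channels=3, size=32, kernel=3, num_classes=10):
    """Count multiply-accumulates of the conv layers plus one final linear layer."""
    macs = 0
    for v in cfg:
        if v == 'M':
            size //= 2          # pooling halves the spatial resolution
            continue
        # Each output pixel of each output channel needs in_channels * k * k MACs.
        macs += in_channels * v * kernel * kernel * size * size
        in_channels = v
    macs += in_channels * num_classes   # linear layer on the pooled features
    return macs


macs = count_macs(VGG19_CFG)
flops = 2 * macs                        # one multiply plus one add per MAC
print(f"MACs:  {macs:.3e}")
print(f"FLOPs: {flops:.3e}")
```

Under these assumptions the MAC count lands close to the figure reported by the counter, and doubling it lands close to the number in the paper. The remaining gap could come from BN or activation operations, which this sketch ignores.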
After training, is a checkpoint file saved? If so, please tell me which directory it is in.
Thanks for sharing!
I trained a PreResNet-164 with python main.py -sr --s 0.00001 --dataset cifar10 --arch resnet --depth 164
and I want to prune the model with python resprune.py --dataset cifar10 --depth 164 --percent 0.4 --model [My model path] --save [My path]
but it fails with the error below, and I don't know how to solve it.
`Test set: Accuracy: 9262/10000 (92.6%)
Cfg:
[10, 15, 16, 24, 15, 16, 7, 9, 11, 17, 15, 15, 23, 15, 15, 8, 12, 15, 19, 15, 16, 20, 15, 15, 13, 12, 16, 21, 15, 16, 14, 15, 15, 18, 13, 16, 17, 12, 15, 13, 13, 15, 1, 4, 7, 9, 12, 16, 18, 12, 15, 25, 15, 16, 35, 32, 32, 38, 31, 32, 47, 32, 32, 44, 31, 32, 51, 30, 32, 43, 31, 32, 38, 31, 32, 43, 32, 32, 54, 32, 32, 70, 32, 32, 51, 32, 32, 52, 32, 32, 51, 31, 32, 52, 32, 32, 52, 30, 32, 57, 32, 32, 51, 30, 32, 61, 31, 32, 125, 64, 64, 59, 60, 64, 68, 60, 64, 66, 63, 64, 78, 62, 64, 89, 64, 64, 106, 63, 64, 119, 64, 64, 123, 64, 64, 149, 64, 64, 135, 64, 64, 144, 64, 63, 127, 64, 63, 139, 64, 63, 144, 64, 64, 135, 64, 64, 141, 63, 64, 141, 64, 63, 85]
Traceback (most recent call last):
File "resprune.py", line 181, in
m1.weight.data = m0.weight.data.clone()
File "/home/ubuntu/anaconda3/envs/YOLACT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 585, in getattr
type(self).name, name))
AttributeError: 'Sequential' object has no attribute 'weight'`
Can someone give me some suggestions?
Thanks very much!
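Hello. I have a question. My understanding is that the loss function given in the paper "Learning Efficient Convolutional Networks Through Network Slimming" applies to the BN layers that are to be pruned, while the loss at the final layer of the network is still the classic YOLOv3 loss. Is that the right way to understand it? Judging from the code, the final loss is still the classic YOLOv3 loss value, with no L1 regularization term added. If the loss at the end of the network were the formula from the paper, then the backward gradient of every layer should include the gradient of the L1 regularizer.
Looking forward to your reply. Thank you very much.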
Traceback (most recent call last):
File "vggprune.py", line 73, in
mask = weight_copy.gt(thre).float().cuda()
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'other'
My environment is Python 3.6 with torch 0.4.1.
Thank you for sharing. I have a question about the sparsity regularization.
As shown in formula (1) of the paper, g(s) is added to the loss function. In the code, however, I cannot find g(s) in the loss function. I only find that an additional gradient term for the scaling factors is added to the original gradient.
Could you show me where g(s) is added to the loss function in the code?
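A sketch of why the two formulations can coincide: the subgradient of the penalty s·Σ|γ| with respect to each γ is s·sign(γ), so adding that term to the gradient has the same effect on the update as adding the penalty to the loss. The check below compares the analytic subgradient with a finite-difference estimate in plain Python. The function names and the sample values are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sparsity_penalty(gammas, s):
    """Penalty term s * sum(|gamma|), i.e. an L1 norm on the scaling factors."""
    return s * sum(abs(g) for g in gammas)


def penalty_subgradient(gammas, s):
    """Subgradient of the penalty: s * sign(gamma) for each scaling factor."""
    return [s * sign(g) for g in gammas]


def numeric_gradient(gammas, s, eps=1e-6):
    """Central finite differences of the penalty, one scaling factor at a time."""
    grads = []
    for i in range(len(gammas)):
        up = list(gammas)
        up[i] += eps
        down = list(gammas)
        down[i] -= eps
        grads.append((sparsity_penalty(up, s) - sparsity_penalty(down, s)) / (2 * eps))
    return grads


gammas = [0.8, -0.3, 0.05]
s = 1e-4
analytic = penalty_subgradient(gammas, s)
numeric = numeric_gradient(gammas, s)
print(analytic)
print(numeric)
```

The penalty value itself never needs to be computed during training unless it should appear in the logged loss.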
What is the purpose of the channel selection layer for ResNet?
When I run vggprune.py, I get this error: 'unexpected key "module.feature.0.weight" in state_dict'.
I think the parameter names in the saved model are wrong. What should I do?
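A `module.` prefix on every key usually means the model was wrapped in `nn.DataParallel` when the checkpoint was saved. One common workaround, sketched here with plain dictionaries, is to strip the prefix before loading. The helper name `strip_module_prefix` is hypothetical, and the string values stand in for tensors.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Placeholder values stand in for tensors; only the keys matter here.
saved = OrderedDict([
    ("module.feature.0.weight", "w0"),
    ("module.classifier.bias", "b"),
])
cleaned = strip_module_prefix(saved)
print(list(cleaned))
```

In practice the cleaned dictionary would be passed to `model.load_state_dict(...)` in place of the original one.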
Hello, after training VGG-16 I found that all gamma values lie between 0.1 and 0.8, so they do not seem to be sparse.
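start_time = time.time()
output = model(data)
end_time = time.time()
total_time += (end_time - start_time)
When I measure it this way, why is the inference speed after pruning no different from the unpruned model, with no noticeable improvement?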
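One thing worth ruling out before comparing speeds: GPU kernels are launched asynchronously, so reading the clock right after `model(data)` may measure launch overhead rather than compute time. The sketch below is a generic timing helper with warm-up runs and an explicit synchronization hook. The name `time_forward` and its parameters are hypothetical; with PyTorch on a GPU, `torch.cuda.synchronize` would be passed as `sync`.

```python
import time


def time_forward(fn, inputs, sync=lambda: None, warmup=3, repeats=10):
    """Average seconds per call of fn(inputs), synchronizing before reading the clock."""
    for _ in range(warmup):          # warm-up runs are not timed
        fn(inputs)
    sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(inputs)
    sync()                           # wait for queued device work before stopping the clock
    return (time.perf_counter() - start) / repeats


# Stand-in workload so the sketch runs without a GPU.
avg = time_forward(lambda xs: sum(x * x for x in xs), list(range(1000)))
print(f"{avg * 1e6:.1f} us per call")
```

Even with correct timing, a pruned model with irregular channel counts may not run proportionally faster, since convolution libraries are tuned for common layer widths.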
code error:
invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
fix:
replace .data[0]
in main.py function train() and test() with .item()
Hello, may I ask why the absolute value of the BN weights is taken when pruning?
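A plausible reason is that a channel's influence depends on the magnitude of its scaling factor, not its sign: a large negative gamma still scales the channel strongly. The sketch below computes a global threshold over |gamma| and derives per-layer keep masks in plain Python. The function names and sample values are hypothetical, and `percent` is assumed to be below 1.

```python
def global_threshold(layer_gammas, percent):
    """Magnitude below which roughly `percent` of all scaling factors fall."""
    magnitudes = sorted(abs(g) for layer in layer_gammas for g in layer)
    index = int(len(magnitudes) * percent)
    return magnitudes[index]


def channel_masks(layer_gammas, threshold):
    """1 keeps a channel, 0 prunes it; the comparison uses |gamma|, not gamma."""
    return [[1 if abs(g) > threshold else 0 for g in layer] for layer in layer_gammas]


layer_gammas = [[0.9, -0.7, 0.01], [-0.02, 0.5, -0.6]]
thre = global_threshold(layer_gammas, percent=0.3)
masks = channel_masks(layer_gammas, thre)
print(thre, masks)
```

Without the absolute value, the channels with gamma -0.7 and -0.6 would fall below any positive threshold and be pruned despite their large magnitude.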
Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.
Traceback (most recent call last):
File "vggprune.py", line 123, in
newmodel = vgg(dataset=args.dataset, cfg=cfg)
File "/data/hzm/network-slimming-master/models/vgg.py", line 22, in init
self.feature = self.make_layers(cfg, True)
File "/data/hzm/network-slimming-master/models/vgg.py", line 39, in make_layers
conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 315, in init
False, pair(0), groups, bias)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 43, in init
self.reset_parameters()
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 47, in reset_parameters
init.kaiming_uniform(self.weight, a=math.sqrt(5))
File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 288, in kaiming_uniform_
fan = _calculate_correct_fan(tensor, mode)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 257, in _calculate_correct_fan
fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/init.py", line 191, in _calculate_fan_in_and_fan_out
receptive_field_size = tensor[0][0].numel()
IndexError: index 0 is out of bounds for dimension 0 with size 0
Hi Eric,
Since I need to train a ResNet50 with fewer parameters, I read the ResNet definition in your code. I found that it is quite different from the official PyTorch torchvision model. Could you please tell me how to change the code so that I can train a pruned ResNet50?
Why is testing with the ResNet-164 model slow? Its FLOPs and parameter count are smaller than those of ResNet-18 (the official PyTorch model, for CIFAR-10), but it is slower.
Is it because of channel_selection?
Thanks!
@Eric-mingjie: First of all, thank you very much. When I try to prune my architecture (SimpNet), the pruning crashes halfway through with this error:
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead
What is wrong here? Could you kindly help me resolve this issue?
Here is the whole log:
=> loading checkpoint 'model_best_simpnet8.pth.tar'
=> loaded checkpoint 'model_best_simpnet8.pth.tar' (epoch 101) Prec1: 96.120000
simpnet8m(
(features): Sequential(
(0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
(2): ReLU(inplace)
(3): Dropout2d(p=0.02)
(4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(6): ReLU(inplace)
(7): Dropout2d(p=0.05)
(8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(10): ReLU(inplace)
(11): Dropout2d(p=0.05)
(12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(14): ReLU(inplace)
(15): Dropout2d(p=0.05)
(16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(18): ReLU(inplace)
(19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
(20): Dropout2d(p=0.05)
(21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(23): ReLU(inplace)
(24): Dropout2d(p=0.05)
(25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(27): ReLU(inplace)
(28): Dropout2d(p=0.05)
(29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(31): ReLU(inplace)
(32): Dropout2d(p=0.05)
(33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(35): ReLU(inplace)
(36): Dropout2d(p=0.05)
(37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
(39): ReLU(inplace)
(40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
(41): Dropout2d(p=0.1)
(42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
(44): ReLU(inplace)
(45): Dropout2d(p=0.1)
(46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
(48): ReLU(inplace)
(49): Dropout2d(p=0.1)
(50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
(52): ReLU(inplace)
)
(classifier): Linear(in_features=600, out_features=10, bias=True)
)
layer index: 3 total channel: 128 remaining channel: 9
layer index: 7 total channel: 182 remaining channel: 17
layer index: 11 total channel: 182 remaining channel: 44
layer index: 15 total channel: 182 remaining channel: 34
layer index: 19 total channel: 182 remaining channel: 89
layer index: 24 total channel: 182 remaining channel: 128
layer index: 28 total channel: 182 remaining channel: 102
layer index: 32 total channel: 182 remaining channel: 86
layer index: 36 total channel: 182 remaining channel: 95
layer index: 40 total channel: 430 remaining channel: 9
layer index: 45 total channel: 430 remaining channel: 89
layer index: 49 total channel: 455 remaining channel: 8
layer index: 53 total channel: 600 remaining channel: 339
Pre-processing Successful!
Files already downloaded and verified
Test set: Accuracy: 1000/10000 (10.0%)
[9, 17, 44, 34, 89, 'M', 128, 102, 86, 95, 9, 'M', 89, 8, 339]
In shape: 3, Out shape 9.
In shape: 9, Out shape 17.
In shape: 17, Out shape 44.
In shape: 44, Out shape 34.
In shape: 34, Out shape 89.
In shape: 89, Out shape 128.
In shape: 128, Out shape 102.
In shape: 102, Out shape 86.
In shape: 86, Out shape 95.
In shape: 95, Out shape 9.
In shape: 9, Out shape 89.
In shape: 89, Out shape 8.
In shape: 8, Out shape 339.
simpnet8m(
(features): Sequential(
(0): Conv2d(3, 128, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(128, eps=1e-05, momentum=0.05, affine=True)
(2): ReLU(inplace)
(3): Dropout2d(p=0.02)
(4): Conv2d(128, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(5): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(6): ReLU(inplace)
(7): Dropout2d(p=0.05)
(8): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(9): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(10): ReLU(inplace)
(11): Dropout2d(p=0.05)
(12): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(13): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(14): ReLU(inplace)
(15): Dropout2d(p=0.05)
(16): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(17): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(18): ReLU(inplace)
(19): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
(20): Dropout2d(p=0.05)
(21): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(22): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(23): ReLU(inplace)
(24): Dropout2d(p=0.05)
(25): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(26): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(27): ReLU(inplace)
(28): Dropout2d(p=0.05)
(29): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(30): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(31): ReLU(inplace)
(32): Dropout2d(p=0.05)
(33): Conv2d(182, 182, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(34): BatchNorm2d(182, eps=1e-05, momentum=0.05, affine=True)
(35): ReLU(inplace)
(36): Dropout2d(p=0.05)
(37): Conv2d(182, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(38): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
(39): ReLU(inplace)
(40): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
(41): Dropout2d(p=0.1)
(42): Conv2d(430, 430, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(43): BatchNorm2d(430, eps=1e-05, momentum=0.05, affine=True)
(44): ReLU(inplace)
(45): Dropout2d(p=0.1)
(46): Conv2d(430, 455, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(47): BatchNorm2d(455, eps=1e-05, momentum=0.05, affine=True)
(48): ReLU(inplace)
(49): Dropout2d(p=0.1)
(50): Conv2d(455, 600, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
(51): BatchNorm2d(600, eps=1e-05, momentum=0.05, affine=True)
(52): ReLU(inplace)
)
(classifier): Linear(in_features=600, out_features=10, bias=True)
)
Files already downloaded and verified
Traceback (most recent call last):
File "vggprune.py", line 213, in <module>
test(model)
File "vggprune.py", line 150, in test
output = model(data)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/media/hossein/tmpstore/Network_slimming/network-slimming/models/simpnet8m.py", line 49, in forward
out = self.features(x)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 282, in forward
self.padding, self.dilation, self.groups)
File "/media/hossein/0D8910C60D8910C6/pytorch3/lib/python3.6/site-packages/torch/nn/functional.py", line 90, in conv2d
return f(input, weight, bias)
RuntimeError: Given weight of size [9, 3, 3, 3], expected bias to be 1-dimensional with 9 elements, but got bias of size [128] instead
Thanks a lot
Hello, I would like to ask how the case is handled where a VGG layer is left with 0 channels after pruning. Thank you!
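One workaround for empty layers, which is an assumption for illustration and not necessarily what the repository does, is to keep the channel with the largest |gamma| in any layer whose mask would otherwise be all zeros. Lowering the pruning percentage is another option. The helper name `masks_with_floor` and the sample values are hypothetical.

```python
def masks_with_floor(layer_gammas, threshold):
    """Per-layer keep masks that never leave a layer with zero channels."""
    masks = []
    for gammas in layer_gammas:
        mask = [1 if abs(g) > threshold else 0 for g in gammas]
        if sum(mask) == 0:
            # Fallback (an assumption): keep the single strongest channel.
            strongest = max(range(len(gammas)), key=lambda i: abs(gammas[i]))
            mask[strongest] = 1
        masks.append(mask)
    return masks


layer_gammas = [[0.9, 0.4], [0.01, -0.03, 0.02]]
masks = masks_with_floor(layer_gammas, threshold=0.1)
cfg = [sum(m) for m in masks]    # channel counts used to build the pruned model
print(masks, cfg)
```

With every entry of `cfg` at least 1, constructing the convolution layers of the pruned model no longer hits a zero-sized weight tensor.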
network-slimming/models/densenet.py
Line 72 in 0b2f743
Should the 12 here be replaced with growthRate, and the one in line 73 with n?
Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.
Traceback (most recent call last):
File "vggprune.py", line 128, in
newmodel = vgg(cfg=cfg)
File "/home/ustc/akb/network-slimming/models/vgg.py", line 22, in init
self.feature = self.make_layers(cfg, True)
File "/home/ustc/akb/network-slimming/models/vgg.py", line 39, in make_layers
conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 297, in init
False, _pair(0), groups, bias)
File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 38, in init
self.reset_parameters()
File "/home/ustc/anaconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 44, in reset_parameters
stdv = 1. / math.sqrt(n)
ZeroDivisionError: float division by zero
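Does the iterative pruning mentioned in the paper mean running the code you provide once, and then repeating the process (train with sparsity, prune, and so on) on the resulting fine-tuned model? When I actually tried this, I found that pruning the already-pruned model again removes the same channels as the first round and does not prune any additional channels. What is the reason for this?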
main.py
line 150
def test():
SyntaxError:Invalid Syntax
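Hello, may I ask why we need the channel selection layer to assist the pruning of ResNet and DenseNet?
From reading the code, my understanding is that for ResNet and DenseNet a channel selection layer is added after the BN layer, and then the model is trained. When the model is pruned, the values of the channel selection layer are all set to 0, and those that should be kept are then set to 1.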
# We need to set the channel selection layer.
m2 = new_modules[layer_id + 1]
m2.indexes.data.zero_()
m2.indexes.data[idx1.tolist()] = 1.0
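It seems that here, too, the channels to be pruned in the added channel selection layer are removed and the unpruned ones are kept. But I feel that if I left out the channel selection layer and pruned in the same way as for VGG, there would not be any problem either.
Perhaps I have not understood the code thoroughly enough. I hope you can give me some pointers. Thank you, and I look forward to your reply!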
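A likely reason, given here as one reading rather than the author's statement: in a pre-activation residual block the first BN reads the shared residual stream, which is also carried forward by the skip connection, so its channels cannot simply be removed from the layer that produced them. A selection step lets the block read a subset of channels while the shared stream stays at full width. The toy sketch below uses one number per channel and omits BN and ReLU. The names `toy_block` and `branch` are hypothetical.

```python
def channel_selection(features, indexes):
    """Keep only the channels whose index flag is non-zero."""
    return [f for f, keep in zip(features, indexes) if keep != 0]


def toy_block(x, indexes, branch):
    """x holds one value per channel of the shared residual stream."""
    selected = channel_selection(x, indexes)   # narrow input for the pruned branch
    out = branch(selected)                     # branch must return len(x) channels
    return [a + b for a, b in zip(x, out)]     # skip connection stays full width


x = [1.0, 2.0, 3.0, 4.0]
indexes = [1.0, 0.0, 1.0, 0.0]                 # channels 0 and 2 survive pruning
branch = lambda sel: [sum(sel)] * len(x)       # stand-in for the conv layers
y = toy_block(x, indexes, branch)
print(y)
```

In a plain chain such as VGG there is no shared stream, so the preceding convolution's output filters can be removed directly and no selection layer is needed.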
Hello, could you provide pre-trained or pruned models of VGG-19, ResNet-164 and DenseNet-40 on CIFAR-10 and CIFAR-100? Thank you!
Can this process be applied by fine-tuning a "normally trained" model?
I am pruning InceptionV3 by fine-tuning from a pre-trained ImageNet model, and I find that it becomes sparse only slowly (base_lr=0.01, 30 epochs). I wonder whether fine-tuning can be used here.
Also, what do you think about InceptionV3, which has many branches? I find it troublesome to implement InceptionV3prune.py.
Only L1 regularization of gamma is considered, but what about beta? If gamma is near zero but beta is large, should we still prune that channel?
Hi @Eric-mingjie,
Thanks for your great implementation. I want to fine-tune my pruned network, but main.py reports that there is no --refine flag. Could you show me how to use it?
Thanks,
Hai
A question, please: in the DenseNet code, why is the output feature map of each dense layer within a dense block not used as input to the other dense layers in the same block?
def _make_denseblock(self, block, blocks, cfg):
    layers = []
    assert blocks == len(cfg), 'Length of the cfg parameter is not right.'
    for i in range(blocks):
        # Currently we fix the expansion ratio as the default value
        layers.append(block(self.inplanes, cfg=cfg[i], growthRate=self.growthRate, dropRate=self.dropRate))
        self.inplanes += self.growthRate
    return nn.Sequential(*layers)
This is the dense block in torchvision:
class _DenseBlock(nn.ModuleDict):
    _version = 2

    def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate, memory_efficient=False):
        super(_DenseBlock, self).__init__()
        for i in range(num_layers):
            layer = _DenseLayer(
                num_input_features + i * growth_rate,
                growth_rate=growth_rate,
                bn_size=bn_size,
                drop_rate=drop_rate,
                memory_efficient=memory_efficient,
            )
            self.add_module('denselayer%d' % (i + 1), layer)

    def forward(self, init_features):
        features = [init_features]
        for name, layer in self.items():
            new_features = layer(features)
            features.append(new_features)
        return torch.cat(features, 1)
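Hello, the mask version (prune_mask) does not actually prune the weights. During fine-tuning the masked weights are used, but all parameters are updated, so aren't the BN weights that were set to 0 trained again as well?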
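Hello!
I would like to ask what method you use to count the parameters of the pruned network. As I understand it, the code just sets some unimportant weights to zero?
Thank you!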
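When the pruned network is rebuilt from a smaller channel configuration, as in the `vgg(..., cfg=cfg)` call shown in the tracebacks on this page, the parameter count follows from the layer shapes. The sketch below counts parameters for a VGG-style `cfg`. The layer layout (3×3 convolutions without bias, BN with gamma and beta, one final linear layer) and the helper name `count_params` are assumptions. With a real PyTorch model, `sum(p.numel() for p in model.parameters())` gives the count directly.

```python
def count_params(cfg, in_channels=3, kernel=3, num_classes=10):
    """Parameter count of a VGG-style stack described by a channel configuration."""
    total = 0
    for v in cfg:
        if v == 'M':                                   # pooling has no parameters
            continue
        total += in_channels * v * kernel * kernel     # conv weights, no bias
        total += 2 * v                                 # BN gamma and beta
        in_channels = v
    total += in_channels * num_classes + num_classes   # linear weights and bias
    return total


dense = count_params([64, 'M', 128])
pruned = count_params([20, 'M', 50])
print(dense, pruned, f"{1 - pruned / dense:.1%} fewer parameters")
```

A mask-only variant that zeroes weights without rebuilding the model would keep the original count, because zeroed entries are still stored.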
Hello, is there a pruned ResNet model available? Could you provide it? Thank you very much!
Hey Eric, I got this error during pruning. Can you suggest how I should handle it? Thanks.
Traceback (most recent call last):
File "vggprune.py", line 73, in
mask = weight_copy.abs().gt(thre).float()
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'other'
I trained the VGG and saved the model as a .pth file, then loaded it to prune some of its filters.
After pruning, the last conv layer no longer has 512 filters; some are gone.
How does pruning the last conv layer affect the first linear layer of the classifier, which has shape (512 * 7 * 7, 4096)?
How can I prune the input weights of the classifier to match the last conv layer?
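A sketch of the index mapping, assuming the conv output is flattened channel-first (as `x.view(N, -1)` does on an N×C×H×W tensor): channel c then occupies flat positions c·H·W to (c+1)·H·W − 1, and only those input positions of the first linear layer need to be kept. The helper name `linear_input_indices` and the sample channel list are hypothetical.

```python
def linear_input_indices(kept_channels, height=7, width=7):
    """Flat input positions of the first linear layer that belong to kept channels."""
    area = height * width
    indices = []
    for c in kept_channels:
        indices.extend(range(c * area, (c + 1) * area))
    return indices


kept = [0, 2, 5]                      # surviving filters of the last conv layer
cols = linear_input_indices(kept)
print(len(cols), cols[:3], cols[-3:])
```

The pruned linear layer would then keep only the weights connected to the positions in `cols`, giving `len(kept) * 7 * 7` input features.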
class channel_selection(nn.Module):
    def __init__(self, num_channels):
        super(channel_selection, self).__init__()
        self.indexes = nn.Parameter(torch.ones(num_channels))
Should this be:
        self.indexes = nn.Parameter(torch.ones(num_channels), requires_grad=False)
?
Line 126 in 98e6b4d
Hello, thank you for sharing the code.
I tried to reimplement your approach but found that the weights of nn.BatchNorm2d are not in model.parameters(), so the optimizer won't update them. Also, the function updateBN() doesn't work: in "m.weight.grad.data.add_(...)", weight.grad is NoneType.
Could you share how you resolved this, or have I missed something? Thanks!
Hi! After channel_selection is invoked, the entries in indexes that correspond to the channels to be pruned are set to 0, so the number of channels of the feature map changes. After pruning, how do you determine the number of channels of the feature map?
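Hello, thank you very much for open-sourcing your code!
I would like to ask a question: when designing the ResNet bottleneck, what was the reasoning behind placing BN before the conv?
When I applied your ResNet pruning approach to a detection model, I used a bottleneck with BN placed after the conv, and I am not sure how much the pruning differs from the former.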
I tried to modify the code to support multi-GPU training but could not get it to work.
So I am curious: why didn't you implement multi-GPU training?
Thank you in advance!
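Hello, thank you very much for your work. I have a question: for network slimming, is it feasible to restructure the network after pruning to make it more lightweight?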
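Hello, thank you for your reimplementation. A small question: in your mask implementation, you mask the parts of the BN layers that fall below the threshold but do not actually cut the connections. Does this mean the final model's weights are not actually reduced? Looking forward to your answer.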
Lines 161 to 166 in 98e6b4d
I think there is something wrong with the weights of the nn.Linear layer. nn.Linear is applied after flattening the output of the last conv2d, so I think the index shouldn't be idx0; the shape of m0.weight depends on the output shape of the last conv2d.
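May I ask: for 10-class classification, the sparsity coefficients for VGG, ResNet and DenseNet are 0.0001, 0.00001 and 0.00001 respectively. What are the sparsity coefficients for 100-class classification?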
`if conv_count % 3 != 1:
w1 = w1[idx1.tolist(), :, :, :].clone() # ???
`
I have a question about the code pasted above: why is it not written as below?
w1 = w1[:,idx1.tolist(), :, :].clone()
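For orientation, a PyTorch `Conv2d` weight has shape [out_channels, in_channels, kH, kW], so indexing the first axis keeps whole output filters while indexing the second axis keeps input channels of every filter. Which one applies depends on whether the mask describes the layer before or after the convolution. The sketch below mimics both forms on nested lists. The helper names are hypothetical.

```python
def select_out(weight, idx):
    """Equivalent of weight[idx, :, :, :]: keep whole output filters."""
    return [weight[o] for o in idx]


def select_in(weight, idx):
    """Equivalent of weight[:, idx, :, :]: keep input channels of every filter."""
    return [[filt[i] for i in idx] for filt in weight]


def shape(weight):
    """Shape of a 4-level nested list as (out, in, kH, kW)."""
    return (len(weight), len(weight[0]), len(weight[0][0]), len(weight[0][0][0]))


# Toy weight with 4 output channels, 3 input channels and 1x1 kernels.
w = [[[[10 * o + i]] for i in range(3)] for o in range(4)]
out_pruned = select_out(w, [0, 2])
in_pruned = select_in(w, [1])
print(shape(w), shape(out_pruned), shape(in_pruned))
```

The quoted line indexes the first axis, so it reduces the number of output channels of that convolution.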
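May I ask why the L1 gradient operation is applied to the BN scale only in the backward pass, rather than being reflected in the loss as in the formula given in the paper?
I don't quite understand this. Could you please clarify? Thank you!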
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [10,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [11,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [12,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [24,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [64,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [75,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [76,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [77,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [88,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [89,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [90,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [101,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [102,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [103,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [114,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [115,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [116,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [127,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [36,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [37,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [38,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [49,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [50,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [51,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [62,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:53: lambda ->auto::operator()(int)->auto: block: [0,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
Traceback (most recent call last):
File "resprune.py", line 224, in
main()
File "resprune.py", line 203, in main
w1 = w1[idx1.tolist(), :, :, :].clone()
RuntimeError: CUDA error: device-side assert triggered