
Comments (22)

jkhu29 avatar jkhu29 commented on May 23, 2024 3

The cause may be that tensorboard is not installed; see https://github.com/fangwei123456/spikingjelly/blob/0.0.0.0.12/spikingjelly/clock_driven/cu_kernel_opt.py#L9C1-L10C1.
In version 0.0.0.0.12, importing cupy also requires importing tensorboard; version 0.0.0.0.14 seems to have fixed this.
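
A minimal sketch for finding out which import is actually failing, assuming (as this comment describes) that the framework imports cupy together with tensorboard, so a failure of either one surfaces as "CuPy is not installed!". The helper name `probe_imports` and the module list are illustrative assumptions, not part of SpikingJelly.

```python
import importlib


def probe_imports(module_names):
    """Try importing each module; map its name to None on success or to the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # report any failure, not only ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names are an assumption taken from the comment above.
    for name, error in probe_imports(["cupy", "tensorboard"]).items():
        print(f"{name}: {'OK' if error is None else error}")
```

Run it with the same Python interpreter that runs the training script; if cupy imports cleanly but tensorboard does not, installing tensorboard or upgrading the framework is the likelier fix.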

from spikingformer.

zhouchenlin2096 avatar zhouchenlin2096 commented on May 23, 2024
  1. What is the error during the installation of "cupy"?
  2. Please check your torch / CUDA version and your spikingjelly version.
  3. Try the "torch" backend in place of "cupy"; note that this may make training slower.
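
One way to collect the versions asked for above is to read the installed package metadata. This is a sketch using only the standard library; the helper name `installed_version` and the list of distributions to query are assumptions chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if no metadata is found."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # CuPy wheels are published under CUDA-specific names such as "cupy-cuda11x",
    # so a plain "cupy" lookup may report nothing even when a wheel is installed.
    for dist in ("torch", "spikingjelly", "cupy", "cupy-cuda11x", "cupy-cuda12x"):
        print(f"{dist}: {installed_version(dist) or 'not found'}")
```

Posting this output together with the exact error message makes it easier for maintainers to spot a version mismatch.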

from spikingformer.

qian26 avatar qian26 commented on May 23, 2024

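I ran into the problem above as well. No matter how I install CuPy, whether with pip or conda, running the code always reports 'CuPy is not installed! You can install it from "https://github.com/cupy/cupy".'
But when I test `import cupy` on its own in a separate new file, it works without any problem. How can I solve this?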

from spikingformer.

yult0821 avatar yult0821 commented on May 23, 2024

You can refer to a similar issue in SpikingJelly which may help to solve this problem. fangwei123456/spikingjelly#243

from spikingformer.

qian26 avatar qian26 commented on May 23, 2024

Thanks. I have already referred to the web page you provided.

"conda list torch" and "conda list cupy" are fine, and "import cupy" is fine, but the code still says 'CuPy is not installed! You can install it from "https://github.com/cupy/cupy".'

I'm now using "torch" as the backend, which is much slower: it took me nearly 12 hours to run 70 epochs, and my code is set to 400 epochs. I think my computer will probably be tired to death (sad). God bless my computer and me.

from spikingformer.

Castrol68 avatar Castrol68 commented on May 23, 2024

I ran into this problem too. Is there a solution?

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

How about trying the latest version of the framework? @Castrol68

from spikingformer.

touristourist avatar touristourist commented on May 23, 2024

How about trying the latest version of the framework? @Castrol68

Hi, I've got a question about the CuPy acceleration impact on Spikingformer. When using the torch backend, each iter takes ~0.71 secs, while with CuPy, it takes 0.57 secs, resulting in ~20% reduction in training time.

I wonder whether this acceleration ratio is normal, considering that the acceleration impact of CuPy in the tutorial is extremely significant.

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

@touristourist Hi, it depends on T, the number of time-steps. You can try different T. If you use a small T, then the acceleration ratio is also small.

from spikingformer.

touristourist avatar touristourist commented on May 23, 2024

@touristourist Hi, it depends on T, the number of time-steps. You can try different T. If you use a small T, then the acceleration ratio is also small.

Alright, got it! So, I understand that different timesteps do have an impact on the acceleration ratio. However, the chart in the tutorial shows that when the timestep is 8, the forward and backward processes can speed up by approximately 5 times (8.13/1.65), which is quite a significant difference compared to what I'm experiencing (only a 20% time reduction). Could it be that the chart specifically shows the acceleration effect only on neurons, while operations like convolution and linear cannot be accelerated with CuPy? As a newcomer to spikingjelly, I'm eagerly awaiting your response. Thanks!

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

Could it be that the chart specifically shows the acceleration effect only on neurons, while operations like convolution and linear cannot be accelerated with CuPy.

Yes. Because only the neuron layers are accelerated, the acceleration ratio of the whole network is smaller than that of a single neuron layer.
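
The gap between the two numbers can be sanity-checked with Amdahl's law. The sketch below uses only figures quoted in this thread and assumes that the tutorial's neuron-only speedup (8.13/1.65) carries over to this model and that everything outside the neuron layers runs at the same speed under both backends; neither assumption is confirmed here.

```python
def accelerated_fraction(overall_ratio, part_speedup):
    """Invert Amdahl's law to get the fraction p of runtime that was sped up.

    overall_ratio = new_time / old_time = (1 - p) + p / part_speedup
    """
    return (1.0 - overall_ratio) / (1.0 - 1.0 / part_speedup)


torch_iter, cupy_iter = 0.71, 0.57  # seconds per iteration, as reported above
neuron_speedup = 8.13 / 1.65        # tutorial figure quoted above for 8 time-steps

p = accelerated_fraction(cupy_iter / torch_iter, neuron_speedup)
print(f"estimated neuron share of a torch-backend iteration: {p:.1%}")
print(f"upper bound on overall speedup at that share: {1 / (1 - p):.2f}x")
```

Under these assumptions the neuron layers account for roughly a quarter of each iteration, so a reduction of about 20% in iteration time is consistent with a large speedup of the neurons alone.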

from spikingformer.

ShaopengLu avatar ShaopengLu commented on May 23, 2024

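I ran into the problem above as well. No matter how I install CuPy, whether with pip or conda, running the code always reports 'CuPy is not installed! You can install it from "https://github.com/cupy/cupy".' But when I test `import cupy` on its own in a separate new file, it works without any problem. How can I solve this?

Hello, I have run into the same problem. Have you managed to solve it?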

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

How about trying the solution in jkhu29's comment?

from spikingformer.

ShaopengLu avatar ShaopengLu commented on May 23, 2024

fangwei123456

Traceback (most recent call last):
File "D:\Postgraduate\code\Spikingformer-master\imagenet\model.py", line 261, in <module>
model = create_model(
File "F:\anaconda\envs\pytorch\lib\site-packages\timm\models\factory.py", line 71, in create_model
model = create_fn(pretrained=pretrained, pretrained_cfg=pretrained_cfg, **kwargs)
File "D:\Postgraduate\code\Spikingformer-master\imagenet\model.py", line 251, in Spikingformer
model = vit_snn(
File "D:\Postgraduate\code\Spikingformer-master\imagenet\model.py", line 194, in __init__
patch_embed = SpikingTokenizer(img_size_h=img_size_h,
File "D:\Postgraduate\code\Spikingformer-master\imagenet\model.py", line 132, in __init__
self.proj1_lif = MultiStepLIFNode(tau=2.0, detach_reset=True, backend='cupy')
File "F:\anaconda\envs\pytorch\lib\site-packages\spikingjelly\clock_driven\neuron.py", line 823, in __init__
check_backend(backend)
File "F:\anaconda\envs\pytorch\lib\site-packages\spikingjelly\clock_driven\neuron.py", line 30, in check_backend
assert cupy is not None, 'CuPy is not installed! You can install it from "https://github.com/cupy/cupy".'
AssertionError: CuPy is not installed! You can install it from "https://github.com/cupy/cupy".
Hello, I tried it and still get the error. I saw the comments mention that it is possible to run without cupy. How should I go about that?

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

If you are not using cupy, set the backend of the neurons to torch.

from spikingformer.

ShaopengLu avatar ShaopengLu commented on May 23, 2024

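If you are not using cupy, set the backend of the neurons to torch.

Hello, do you mean changing all of these to torch? For example, self.mlp1_lif = MultiStepLIFNode(tau=2.0, detach_reset=True, backend='cupy'). Is it lines like this one that need changing, or should it be done some other way?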

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

Yes, set backend='torch' for all neurons.
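
Rather than editing every neuron by hand, the backend string could be chosen once and reused. The following is only a sketch: `pick_backend` and `BACKEND` are hypothetical names, and the check assumes that a clean `import cupy` is a good enough signal, which earlier comments in this thread show is not always true for SpikingJelly's own check.

```python
import importlib


def pick_backend(required=("cupy",)):
    """Return 'cupy' only if every required module imports cleanly, else 'torch'."""
    for name in required:
        try:
            importlib.import_module(name)
        except Exception:
            return "torch"
    return "cupy"


BACKEND = pick_backend()
# Hypothetical use in model.py, replacing each hard-coded backend='cupy':
#   self.proj1_lif = MultiStepLIFNode(tau=2.0, detach_reset=True, backend=BACKEND)
print("neuron backend:", BACKEND)
```

If the framework still raises the "CuPy is not installed!" assertion even though this reports cupy, setting `BACKEND = "torch"` explicitly remains the fallback described above.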

from spikingformer.

ShaopengLu avatar ShaopengLu commented on May 23, 2024

Yes, set backend='torch' for all neurons.

Hello, after making that change it now runs. However, when I ran test.py I hit the problem below. Do you know how to solve it?
INFO:train:Training with a single process on 1 GPUs.
Training with a single process on 1 GPUs.
Creating model
number of params: 29705768
INFO:train:Model vitsnn created, param count:29705768
Model vitsnn created, param count:29705768
INFO:timm.data.config:Data processing configuration for current model + dataset:
Data processing configuration for current model + dataset:
INFO:timm.data.config: input_size: (3, 224, 224)
input_size: (3, 224, 224)
INFO:timm.data.config: interpolation: bicubic
interpolation: bicubic
INFO:timm.data.config: mean: (0.485, 0.456, 0.406)
mean: (0.485, 0.456, 0.406)
INFO:timm.data.config: std: (0.229, 0.224, 0.225)
std: (0.229, 0.224, 0.225)
INFO:timm.data.config: crop_pct: 1.0
crop_pct: 1.0
INFO:train:Using native Torch AMP. Training in mixed precision.
Using native Torch AMP. Training in mixed precision.
ERROR:timm.models.helpers:No checkpoint found at '/media/data/spike-transformer-network/spikingformer_github/imagenet/output/train/Spikingformer_models/checkpoint-284.pth.tar'
ERROR: No checkpoint found at '/media/data/spike-transformer-network/spikingformer_github/imagenet/output/train/Spikingformer_models/checkpoint-284.pth.tar'
Traceback (most recent call last):
File "D:\code\Spikingformer-master\imagenet\test.py", line 639, in <module>
main()
File "D:\code\Spikingformer-master\imagenet\test.py", line 437, in main
resume_epoch = resume_checkpoint(
File "D:\anaconda\envs\g1\lib\site-packages\timm\models\helpers.py", line 113, in resume_checkpoint
raise FileNotFoundError()
FileNotFoundError

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

Loading the previously saved weights failed: the checkpoint file was not found.
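
The log above asks for a checkpoint under /media/data/... while the script itself runs from a D:\ drive, which suggests the path is a default that does not exist on this machine. A small guard like the one below can catch that before timm's resume_checkpoint raises FileNotFoundError. It is a sketch only: `existing_checkpoint` and the example path are hypothetical, not part of the Spikingformer code.

```python
from pathlib import Path


def existing_checkpoint(path):
    """Return the path as a string if the file exists, otherwise an empty string."""
    if path and Path(path).is_file():
        return str(path)
    return ""


if __name__ == "__main__":
    # Hypothetical local path; substitute wherever your own training run saved weights.
    resume = existing_checkpoint("output/train/checkpoint-284.pth.tar")
    if not resume:
        print("checkpoint not found; fix the path before calling resume_checkpoint")
```

Pointing the resume argument at a checkpoint file that actually exists on the local disk should get past this error.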

from spikingformer.

liberary233 avatar liberary233 commented on May 23, 2024

I hit the following error both when using a package to compute FLOPs and when converting the model to ONNX format:


AssertionError Traceback (most recent call last)
/tmp/ipykernel_811/993365423.py in <module>
5 batch_size = 1
6 input_shape = (batch_size, 3, 32, 32)
----> 7 flops, macs, params = calculate_flops(model=model,
8 input_shape=input_shape,
9 output_as_string=True,

~/miniconda3/lib/python3.8/site-packages/calflops/flops_counter.py in calculate_flops(model, input_shape, transformer_tokenizer, args, kwargs, forward_mode, include_backPropagation, compute_bp_factor, print_results, print_detailed, output_as_string, output_precision, output_unit, ignore_modules)
163
164 if forward_mode == 'forward':
--> 165 _ = model(*args)
166 if forward_mode == 'generate':
167 _ = model.generate(*args)

~/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1118 input = bw_hook.setup_input_hook(input)
1119
-> 1120 result = forward_call(input, **kwargs)
1121 if _global_forward_hooks or self._forward_hooks:
1122 for hook in (
_global_forward_hooks.values(), *self._forward_hooks.values()):

~/autodl-fs/20231101_spikformer_cifar10/work/model.py in forward(self, x)
231 def forward(self, x):
232 x = (x.unsqueeze(0)).repeat(self.T, 1, 1, 1, 1)
--> 233 x = self.forward_features(x)
234 x = self.head(x.mean(0))
235 return x

~/autodl-fs/20231101_spikformer_cifar10/work/model.py in forward_features(self, x)
224 patch_embed = getattr(self, f"patch_embed")
225
--> 226 x = patch_embed(x)
227 for blk in block:
228 x = blk(x)

~/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1118 input = bw_hook.setup_input_hook(input)
1119
-> 1120 result = forward_call(input, **kwargs)
1121 if _global_forward_hooks or self._forward_hooks:
1122 for hook in (
_global_forward_hooks.values(), *self._forward_hooks.values()):

~/autodl-fs/20231101_spikformer_cifar10/work/model.py in forward(self, x)
142 x = self.proj_conv(x.flatten(0, 1)) # have some fire value
143 x = self.proj_bn(x).reshape(T, B, -1, H, W).contiguous()
--> 144 x = self.proj_lif(x).flatten(0, 1).contiguous()
145
146 x = self.proj_conv1(x)

~/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1118 input = bw_hook.setup_input_hook(input)
1119
-> 1120 result = forward_call(input, **kwargs)
1121 if _global_forward_hooks or self._forward_hooks:
1122 for hook in (
_global_forward_hooks.values(), *self._forward_hooks.values()):

~/miniconda3/lib/python3.8/site-packages/spikingjelly/clock_driven/neuron.py in forward(self, x_seq)
853 torch.fill_(self.v, v_init)
854
--> 855 spike_seq, self.v_seq = neuron_kernel.MultiStepLIFNodePTT.apply(
856 x_seq.flatten(1), self.v.flatten(0), self.decay_input, self.tau, self.v_threshold, self.v_reset, self.detach_reset, self.surrogate_function.cuda_code)
857

~/miniconda3/lib/python3.8/site-packages/spikingjelly/clock_driven/neuron_kernel.py in forward(ctx, x_seq, v_last, decay_input, tau, v_threshold, v_reset, detach_reset, sg_cuda_code_fun)
755 kernel(
756 (blocks,), (threads,),
--> 757 cu_kernel_opt.wrap_args_to_raw_kernel(
758 device,
759 *kernel_args

~/miniconda3/lib/python3.8/site-packages/spikingjelly/clock_driven/cu_kernel_opt.py in wrap_args_to_raw_kernel(device, *args)
62
63 elif isinstance(item, cupy.ndarray):
---> 64 assert item.device.id == device
65 assert item.flags['C_CONTIGUOUS']
66 ret_list.append(item)

AssertionError:

How should I solve this? I do have cupy installed in my environment.

from spikingformer.

fangwei123456 avatar fangwei123456 commented on May 23, 2024

Was the error above produced while running on the CPU?

from spikingformer.

liberary233 avatar liberary233 commented on May 23, 2024

Was the error above produced while running on the CPU?

It was run in a GPU environment.

from spikingformer.
