I tried the newest code update (6.28), and test_1024p.sh still runs out of memory.

RuntimeError: dimension specified as 0 but tensor has no dimensions

nvidia commented on May 17, 2024

RuntimeError: dimension specified as 0 but tensor has no dimensions

from pix2pixhd.

Comments (8)

cientgu commented on May 17, 2024

I found a simple fix: in pix2pixHD_model.py, reshape each of the five losses in the forward function, for example: loss_G_GAN = loss_G_GAN.reshape(1)
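A minimal sketch of this reshape fix, assuming the losses are 0-dimensional tensors as PyTorch 0.4.0 produces them. Only loss_G_GAN is named in the comment above; the other names and all values here are placeholders for illustration, not taken from pix2pixHD_model.py.

```python
import torch

# Placeholder scalar losses; in the real model these are computed in forward().
# Only loss_G_GAN is named in the thread, the other keys are hypothetical.
losses = {
    "loss_G_GAN": torch.tensor(0.7),
    "loss_other_1": torch.tensor(0.3),
    "loss_other_2": torch.tensor(1.2),
}

# A 0-dimensional tensor has no dimension 0 to gather along,
# so give each loss shape (1,) before returning it.
reshaped = {name: loss.reshape(1) for name, loss in losses.items()}

for name, loss in reshaped.items():
    print(name, tuple(loss.shape))
```

Each loss keeps its value; only its shape changes from () to (1,).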

lkkchung commented on May 17, 2024

The easiest fix for me was to roll back PyTorch: conda install pytorch=0.3.1 did the trick.

commented on May 17, 2024

I have the same issue.
My code works on a single GPU (EC2 p2.xlarge instance), but I get a similar error when running on multiple GPUs (EC2 p2.8xlarge).

I launch the train command as:
python train.py --name xxxx --dataroot ./datasets/xxxx/ --resize_or_crop none --loadSize 512 --fineSize 512 --label_nc 0 --no_instance --no_flip --verbose --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7

The model is created but then I get this error:

Traceback (most recent call last):
  File "train.py", line 61, in <module>
    Variable(data['image']), Variable(data['feat']), infer=save_fake)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 115, in forward
    return self.gather(outputs, self.output_device)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 127, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    return gather_map(outputs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in <lambda>
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
RuntimeError: dimension specified as 0 but tensor has no dimensions

PyTorch version: 0.4.0

Maybe this is related...
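A small sketch of what the last frame of the traceback appears to be doing, assuming each GPU replica hands back its loss as a 0-dimensional tensor. The value is a placeholder, and the exception type differs between PyTorch versions, so both candidates are caught here.

```python
import torch

# Placeholder 0-dimensional tensor, standing in for a reduced loss value.
scalar_loss = torch.tensor(0.5)
print(scalar_loss.dim())  # number of dimensions of the tensor

try:
    # Same kind of call as the lambda in _functions.py: i.size(ctx.dim) with dim=0.
    scalar_loss.size(0)
    failed = False
except (RuntimeError, IndexError) as err:
    failed = True
    print(type(err).__name__, err)
```

Asking a 0-dimensional tensor for the size of dimension 0 is what fails, which is consistent with the error only showing up once outputs from several GPUs have to be gathered.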

cientgu commented on May 17, 2024

@ouyangkid I have the same issue. Did you find out how to fix it? I think maybe we should rewrite the multi-GPU code.

tcwang0509 commented on May 17, 2024

This is because in the new PyTorch version scalar losses are 0-dimensional tensors, which the multi-GPU gather step cannot concatenate along dimension 0. Just add something like
loss_list = [loss.unsqueeze(0) for loss in loss_list] before the model returns and it should work.
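A rough sketch of why the unsqueeze helps, assuming two replicas that each return a list of scalar losses. The replica outputs and their values are hypothetical, and the torch.cat call only approximates what the gather step does; it is not the actual DataParallel code.

```python
import torch

# Hypothetical per-replica outputs: each GPU returns a list of scalar losses.
replica_outputs = [
    [torch.tensor(0.7), torch.tensor(0.3)],  # replica 0
    [torch.tensor(0.9), torch.tensor(0.1)],  # replica 1
]

# The suggested fix, applied to each replica's list before it is returned.
replica_outputs = [
    [loss.unsqueeze(0) for loss in loss_list] for loss_list in replica_outputs
]

# Roughly what gathering does: concatenate matching losses along dim 0.
gathered = [torch.cat(per_loss, dim=0) for per_loss in zip(*replica_outputs)]

# Each gathered loss now holds one value per replica; average to get one number.
mean_losses = [g.mean() for g in gathered]
print([tuple(g.shape) for g in gathered])
```

After gathering, each loss has one entry per GPU, so the training loop has to reduce it (for example with a mean) before calling backward.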

cientgu commented on May 17, 2024

I have rewritten the torch/nn/parallel/scatter_gather.py code and it works. Thanks for your reply.

hahakid commented on May 17, 2024

@cientgu Great work, I will try your solution once I have finished some of my other work.
And what's your PyTorch version?

cientgu commented on May 17, 2024

@ouyangkid 0.4.0
