Comments (8)
I found a simple fix: in pix2pixHD_model.py, reshape each of the five losses in the forward function, e.g. loss_G_GAN = loss_G_GAN.reshape(1)
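A minimal sketch of what that reshape does, assuming the losses are zero-dimensional tensors as produced by reductions in PyTorch 0.4.0 and later. The helper name as_1d and the sample value are hypothetical; only loss_G_GAN comes from the comment above.

```python
import torch


def as_1d(loss):
    """Return the loss as a tensor of shape (1,), so that it has a dimension 0."""
    return loss.reshape(1)


# Stand-in for a reduced loss: a 0-dim tensor (illustrative value only).
loss_G_GAN = torch.tensor(0.25)
loss_G_GAN_1d = as_1d(loss_G_GAN)

print(loss_G_GAN.dim(), tuple(loss_G_GAN_1d.shape))
```

The same call would be applied to each of the other losses before forward returns.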
from pix2pixhd.
The easiest fix for me was to roll back PyTorch. Running
conda install pytorch=0.3.1
did the trick.
from pix2pixhd.
I have the same issue.
My code works on a single GPU (EC2 p2.xlarge instance), but I get a similar error when running on multiple GPUs (EC2 p2.8xlarge).
I launch the train command as:
python train.py --name xxxx --dataroot ./datasets/xxxx/ --resize_or_crop none --loadSize 512 --fineSize 512 --label_nc 0 --no_instance --no_flip --verbose --batchSize 8 --gpu_ids 0,1,2,3,4,5,6,7
The model is created but then I get this error:
Traceback (most recent call last):
  File "train.py", line 61, in <module>
    Variable(data['image']), Variable(data['feat']), infer=save_fake)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 115, in forward
    return self.gather(outputs, self.output_device)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 127, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    return gather_map(outputs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 54, in <lambda>
    ctx.input_sizes = tuple(map(lambda i: i.size(ctx.dim), inputs))
RuntimeError: dimension specified as 0 but tensor has no dimensions
PyTorch version: 0.4.0
Maybe this is related...
from pix2pixhd.
@ouyangkid I have the same issue. Did you find out how to fix it? I think we may need to rewrite the multi-GPU code.
from pix2pixhd.
This is because in the new PyTorch version the losses are zero-dimensional (scalar) tensors, which the multi-GPU gather step cannot concatenate along dimension 0. Just add something like
loss_list = [loss.unsqueeze(0) for loss in loss_list]
before the model returns and it should work.
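To illustrate why the unsqueeze helps, here is a rough CPU-only sketch that imitates the gather step with torch.cat. It is an approximation of what DataParallel does, not its actual code path; make_gatherable, the two fake replicas and the loss values are all hypothetical.

```python
import torch


def make_gatherable(loss_list):
    """Give every scalar loss a leading dimension of size 1."""
    return [loss.unsqueeze(0) for loss in loss_list]


# Pretend two GPU replicas each returned two scalar (0-dim) losses.
replica_outputs = [
    [torch.tensor(0.5), torch.tensor(1.5)],
    [torch.tensor(0.7), torch.tensor(1.1)],
]

# Apply the fix on each replica, as the model would before returning.
fixed = [make_gatherable(losses) for losses in replica_outputs]

# Gathering joins matching outputs from all replicas along dim 0;
# this only works because each loss now has shape (1,).
gathered = [torch.cat(per_loss, dim=0) for per_loss in zip(*fixed)]

# Each gathered loss holds one value per replica, so reduce it again
# (a mean is assumed here) before calling backward or logging.
means = [g.mean() for g in gathered]
print([tuple(g.shape) for g in gathered])
```

After the fix, each loss arrives as a tensor with one entry per GPU rather than a single scalar, so the training loop has to reduce it before use.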
from pix2pixhd.
@cientgu great work, I will try your solution once I have finished some other work.
Which PyTorch version are you using?
from pix2pixhd.
@ouyangkid 0.4.0
from pix2pixhd.