Comments (16)
I would check that the GPU is really not being used at all. You can use nvidia-smi to check how much GPU memory is in use. I haven't tested CPU mode carefully, so there's a chance some memory is still allocated on the GPU.
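As a rough way to script that check, here is a minimal Python sketch that reads per-GPU memory usage through nvidia-smi's CSV query mode. The helper names (parse_gpu_memory, query_gpu_memory) are made up for illustration, and the parser assumes every field is a plain integer in MiB, which may not hold on GPUs that report "[N/A]".

```python
import subprocess

# nvidia-smi CSV query: one line per GPU, e.g. "1200, 4096" (values in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn lines like '1200, 4096' into (used_mib, total_mib) tuples.

    Assumption: each field is a plain integer; this raises ValueError otherwise.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return a list of (used_mib, total_mib), one per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling query_gpu_memory() while training runs in CPU mode should show whether the process has still claimed memory on the GPU.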
If that's not the issue, then it's a bit surprising you would run out of memory. Can you profile how much memory your system is using for the process?
General strategies for reducing memory usage would be to reduce the batch size (try setting it to 1), and reduce the image size (will require modifying the net architectures, e.g., by removing the first and last layers of netG and the first layer of netD).
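To see why batch size and image size matter, here is a back-of-the-envelope sketch of how much memory one feature map takes. It assumes 32-bit floats and a hypothetical 64-channel feature map. It is not a measurement of the actual pix2pix networks, which hold many such tensors plus gradients.

```python
def activation_bytes(batch, channels, height, width, bytes_per_value=4):
    """Bytes needed to store one dense feature map (float32 by default)."""
    return batch * channels * height * width * bytes_per_value


# Hypothetical 64-channel feature map at 256x256, batch size 1.
base = activation_bytes(1, 64, 256, 256)

# Memory grows linearly with batch size...
batch_of_four = activation_bytes(4, 64, 256, 256)

# ...and with the number of pixels, so halving each side cuts it to a quarter.
half_resolution = activation_bytes(1, 64, 128, 128)

print(base // 2**20, "MiB at batch 1, 256x256")
print(batch_of_four // 2**20, "MiB at batch 4, 256x256")
print(half_resolution // 2**20, "MiB at batch 1, 128x128")
```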
from pix2pix.
I turned off the display and the code ran fine. On the CPU it is pretty slow, of course, so I'm going to run it on an AWS GPU instance now.
from pix2pix.
I have 1.4 GB free. How can I limit the amount of memory used on the GPU?
...
Epoch: [1][ 399 / 400] Time: 3.719 DataTime: 0.001 Err_G: 3.4267 Err_D: 0.0299 ErrL1: 0.3935
End of epoch 1 / 200 Time Taken: 938.975
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-4014/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
...
from pix2pix.
How to turn off the display?
from pix2pix.
On the command line, you can pass display=0
from pix2pix.
I am facing the following error when running the training command:
transferring to gpu...
done
THCudaCheck FAIL file=/home/admink/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/admink/torch/install/bin/luajit: /home/admink/torch/install/share/lua/5.1/nn/Module.lua:309: cuda runtime error (2) : out of memory at /home/admink/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'Tensor'
/home/admink/torch/install/share/lua/5.1/nn/Module.lua:309: in function 'flatten'
/home/admink/torch/install/share/lua/5.1/nn/Module.lua:326: in function 'getParameters'
train.lua:445: in main chunk
[C]: in function 'dofile'
...mink/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
How could this issue be resolved?
from pix2pix.
I still get the same error again and again, even after switching off the display.
I can train fine with just 3 images of size 600x600, or a maximum of 10 images of size 128x128.
Beyond that, training fails with the error I mentioned above. How can I train the model on more images, at least 500?
from pix2pix.
Some ideas are here: #107
Other quick fixes:
- Run in CPU mode
- Run on a cloud service that has more GPU memory, e.g., Amazon EC2
- Make the models smaller (e.g., set ngf and ndf to 32 rather than 64)
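To get a feel for how much halving the filter count helps, here is a small arithmetic sketch. It counts weights in a toy stack of four 4x4 convolutions whose widths are ngf, 2*ngf, 4*ngf and 8*ngf. That stack is an assumption chosen for illustration, not the real netG or netD, and it ignores activations, which also take memory.

```python
def conv_params(c_in, c_out, kernel=4):
    """Weights plus biases of one kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out


def toy_encoder_params(ngf, in_channels=3):
    """Parameter count of a hypothetical 4-layer encoder (not the real netG)."""
    widths = [ngf, ngf * 2, ngf * 4, ngf * 8]
    total = 0
    c_in = in_channels
    for c_out in widths:
        total += conv_params(c_in, c_out)
        c_in = c_out
    return total


params_64 = toy_encoder_params(64)
params_32 = toy_encoder_params(32)
print(params_64, params_32, round(params_64 / params_32, 2))
```

In this toy stack, most layers have both their input and output widths halved, so the weight count drops by roughly a factor of four.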
Really it should be possible to avoid running out of memory as you scale the dataset size. The memory should be constant for any sized dataset as images are only loaded on demand. There may be a leak or some inefficiency that causes more memory to be used for larger datasets. I don't have a quick fix, but I would look at the behavior at the end of an epoch, where temporary variables are cleared and reallocated, since this is where you are getting the memory error.
from pix2pix.
It shouldn't.
from pix2pix.
The GPU is much faster :) Functionality should be the same, although I haven't thoroughly tested CPU mode myself (seems like others have had it work).
from pix2pix.
haha yeah it's an unfortunate tradeoff
from pix2pix.