Comments (8)
Hi, are you training on CPU or GPU? How long did you train? Can you give more details about your environment (OS, TensorFlow version, ...)? I trained for multiple days on different machines and it didn't crash due to a leak.
from unflow.
Training on GPU, GTX Titan 1080. Ubuntu 16.04, TensorFlow 1.2. During training, the GPU memory didn't explode, but the CPU memory increases continuously until it reaches 100%.
I noticed that if I set "tf.app.flags.DEFINE_boolean('debug', True)", memory consumption increases quickly.
This flag enables recording and writing images to TensorBoard, but I don't really know why it should cause a memory leak. On what timescale does the memory increase?
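To put numbers on the timescale, one option is to log the host memory of the training process at regular intervals, once with the debug flag and once without. The following is only a minimal sketch using the Python standard library; `peak_rss_mb` and `log_memory` are hypothetical helper names, not part of UnFlow, and the 1000-step interval is an arbitrary assumption.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(step, every=1000, out=sys.stderr):
    """Hypothetical helper: print peak RSS every `every` steps.

    Returns the measured value in MB, or None when the step is skipped.
    """
    if step % every != 0:
        return None
    mb = peak_rss_mb()
    print("step %d: peak RSS %.1f MB" % (step, mb), file=out)
    return mb
```

Calling `log_memory(step)` once per iteration of the training loop gives a curve of host memory against iteration count, which makes the two runs directly comparable. Note that this reports the peak, not the current, resident size, so it shows growth but not memory that is later released.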
After 50,000 iterations, it consumes nearly 30 GB of memory.
Is the 30 GB with or without the debug flag? The flag isn't meant to be used during full training. Could you maybe share your complete config.ini file with me?
The 30 GB is with the debug flag. This is the config.ini:
config.ini.txt
As I was not able to reproduce, I am closing this for now.