Comments (13)
In the end, I found that the problem was the g++/gcc version.
The default version used when compiling with the Makefile was 5.4.1, and that caused the problem.
After I changed g++/gcc to 4.8.4, everything worked.
CC = gcc-4.8 -O2 -pthread
CXX = g++-4.8
from flownet2-tf.
The same problem happened to me, and I commented out the code as @chuchienshu did.
But then a new problem appeared: the training process stops at an early step (0, 30, 60, etc., at random) and reports "LossTensor is inf or nan".
Have you met the same problem? Thank you very much! @sampepose @chuchienshu @vladpaunescu
I found that the problem may have been caused by compiling the code against the wrong TensorFlow version, 1.2. I switched to version 1.3 and recompiled the code, and now it works.
from flownet2-tf.
@myhooo Try adding this code before "return...":
image_as, image_bs, flows = map(lambda x: tf.expand_dims(x, 0), [image_a, image_b, flow])
and don't forget to change the corresponding variables in tf.train.batch.
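For context, the `ValueError: need more than 3 values to unpack` reported in this thread is what Python raises when a three-element shape is unpacked into four names. The sketch below reproduces that with plain Python lists, without TensorFlow. The 384x512x3 shape and the helper names `unpack_nhwc` and `add_batch_dim` are assumptions for illustration only. It is also an assumption that the tensors lose their batch dimension once the augmentation code is removed.

```python
def unpack_nhwc(shape):
    """Mimic `_, height, width, _ = shape`, which expects four dimensions."""
    _, height, width, _ = shape
    return height, width


def add_batch_dim(shape):
    """Shape effect of tf.expand_dims(x, 0): prepend a dimension of size 1."""
    return [1] + list(shape)


# Assumed example shape: height, width, channels (no batch dimension).
single_image = [384, 512, 3]

try:
    unpack_nhwc(single_image)
except ValueError as err:
    # Three values cannot be unpacked into four names.
    print("rank-3 shape fails:", err)

# With a leading batch dimension the same unpacking succeeds.
print("rank-4 shape works:", unpack_nhwc(add_batch_dim(single_image)))
```

This only models the shapes. In the real loader the change is made on the tensors themselves, as in the `tf.expand_dims` line above.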
from flownet2-tf.
Please enable debug to check the exact error:
net = FlowNetS(debug=True)
If the error is the following (as it was in my case):
Couldn't open CUDA library libcupti.so.8.0. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib:/opt/ffmpeg/lib
2017-09-29 18:35:26.710728: F ./tensorflow/stream_executor/lib/statusor.h:212] Non-OK-status: status_ status: Failed precondition: could not dlopen DSO: libcupti.so.8.0; dlerror: libcupti.so.8.0: cannot open shared object file: No such file or directory
Please add /usr/local/cuda/extras/CUPTI/lib64/ to your LD_LIBRARY_PATH.
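A quick way to check this before launching training is sketched below. It only inspects the environment variable and prints a suggested `export` line. The helper names `has_dir` and `with_dir` are made up for this example, and the CUPTI path is the one quoted above, so adjust it if CUDA is installed elsewhere.

```python
import os

# Directory named in the comment above; adjust if CUDA lives elsewhere.
CUPTI_DIR = "/usr/local/cuda/extras/CUPTI/lib64"


def has_dir(path_value, directory):
    """Return True if `directory` is one of the colon-separated entries."""
    wanted = os.path.normpath(directory)
    entries = [e for e in path_value.split(":") if e]
    return any(os.path.normpath(e) == wanted for e in entries)


def with_dir(path_value, directory):
    """Return `path_value` with `directory` appended unless already present."""
    if has_dir(path_value, directory):
        return path_value
    return path_value + ":" + directory if path_value else directory


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    if has_dir(current, CUPTI_DIR):
        print("CUPTI directory is already on LD_LIBRARY_PATH")
    else:
        print("Run this in your shell before starting Python:")
        print('export LD_LIBRARY_PATH="%s"' % with_dir(current, CUPTI_DIR))
```

The variable should be set in the shell rather than from inside the running script, because the dynamic loader reads it when the process starts.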
from flownet2-tf.
Actually, the Segmentation Fault is still happening to me.
After running
gdb python
run src/flownet_s/train.py
I found this:
QueueRunner: corrupted record at 275253737
LE: It seems the SIGSEGV is raised in the C++ augmentation library, preprocessing.so:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff7a7fc700 (LWP 13435)]
std::_Function_handler<void(long long int, long long int), tensorflow::Augment(tensorflow::OpKernelContext*, const Device&, int, int, int, int, int, int, int, float const*, float*, float const*, float*) [with Device = Eigen::ThreadPoolDevice]::<lambda(tensorflow::int64, tensorflow::int64)> >::_M_invoke(const std::_Any_data &, <unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a61>, <unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a66>) (__functor=..., __args#0=<unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a61>, __args#1=<unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a66>)
    at /usr/include/c++/5/functional:1871
1871 (*_Base::_M_get_pointer(__functor))(
from flownet2-tf.
Thanks for your kindness. I agree with your diagnosis, but I am not able to fix the code in preprocessing.cc. Fortunately, I got the desired result by commenting out part of that code and making a few small modifications. Thanks all the same! @vladpaunescu
from flownet2-tf.
@chuchienshu No worries 👍
Can you share the fix? I still can't train any model at all; all I get is a Segmentation Fault.
It would also help to know the exact TensorFlow version the code was implemented on, as well as the Python, CUDA, and cuDNN versions. @sampepose, could you kindly add the requirements to README.md?
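When reporting a setup in a thread like this one, a small script can collect the interpreter and TensorFlow versions in one go. This is only a sketch: `collect_versions` is a name made up for the example, and the CUDA and cuDNN versions are not covered because they need to be read from the local installation.

```python
import platform


def collect_versions():
    """Gather the Python version and, if importable, the TensorFlow version."""
    info = {"python": platform.python_version()}
    try:
        import tensorflow as tf  # optional; may not be installed
        info["tensorflow"] = tf.__version__
    except ImportError:
        info["tensorflow"] = "not installed"
    return info


if __name__ == "__main__":
    for name, version in sorted(collect_versions().items()):
        print("%s: %s" % (name, version))
```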
from flownet2-tf.
Hi!
I managed to train without any data augmentation. However, it would be nice to have the C++ code working, so any advice would be of great value to me.
Vlad
from flownet2-tf.
@sampepose I just removed the data augmentation code in src/dataloader.py, as shown below.
'''crop = [dataset_config['PREPROCESS']['crop_height'],
dataset_config['PREPROCESS']['crop_width']]
config_a = config_to_arrays(dataset_config['PREPROCESS']['image_a'])
......
# Perform flow augmentation using spatial parameters from data augmentation
flows = _preprocessing_ops.flow_augmentation(
flows, transforms_from_a, transforms_from_b, crop)'''
@vladpaunescu It looks like you did the same. I wish the C++ code worked, too. It looks awesome!
from flownet2-tf.
@yinjunbo Hi, I hit the same 'segmentation fault (core dumped)' problem and removed the data augmentation code as chuchienshu described. However, I now get an error like this:
  File "/home/hmy/flownet2_tf/src/flownet2/train.py", line 22, in <module>
    './checkpoints/FlowNetSD/flownet-SD.ckpt-0': ('FlowNet2/FlowNetSD', 'FlowNet2')
  File "src/net.py", line 99, in train
    predictions = self.model(inputs, training_schedule)
  File "src/flownet2/flownet2.py", line 19, in model
    _, height, width, _ = inputs['input_a'].shape.as_list()
ValueError: need more than 3 values to unpack
Have you run into the same situation? If you have solved it, it would be very kind of you to share some tips. Thanks~
from flownet2-tf.
@yinjunbo Thank you very much~ It seems that you have recompiled successfully, congratulations!
Did you only change the TensorFlow version?
Thank you in advance~
from flownet2-tf.
@myhooo You're welcome. I've recompiled it without data augmentation, and version 1.2.0 seems to work.
from flownet2-tf.