When I run python -m src.flownet_s.train, I get this output:


Segmentation fault (core dumped) about flownet2-tf HOT 13 CLOSED

sampepose commented on June 14, 2024 1

Segmentation fault (core dumped)

from flownet2-tf.

Comments (13)

yinjunbo commented on June 14, 2024 2

Finally, I found that the problem is the version of g++/gcc.
The default version used when building with the Makefile was 5.4.1, and that turned out to cause the problem.
After I changed g++/gcc to 4.8.4, everything worked.
CC = gcc-4.8 -O2 -pthread
CXX = g++-4.8


yinjunbo commented on June 14, 2024 1

The same problem happens to me, and I commented out the code as @chuchienshu did.
But a new problem came up: the training process stops at an early step (0, 30, 60, etc., at random) and reports "LossTensor is inf or nan".
Have you met the same problem? Thank you very much! @sampepose @chuchienshu @vladpaunescu

I discovered that the problem may be due to compiling the code against the wrong TensorFlow version (1.2). I changed the version to 1.3 and recompiled the code. Finally it works.


yinjunbo commented on June 14, 2024 1

@myhooo Try adding this code before "return...":
image_as, image_bs, flows = map(lambda x: tf.expand_dims(x, 0), [image_a, image_b, flow])
and don't forget to change the corresponding variables in tf.train.batch.
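
To illustrate why a missing batch dimension produces the "need more than 3 values to unpack" error reported by @myhooo, here is a minimal sketch in plain Python. Shapes are plain lists standing in for what shape.as_list() returns. `unpack_nhwc` and `expand_dims_front` are hypothetical helpers written for this illustration, not functions from flownet2-tf or TensorFlow, and the image size is arbitrary.

```python
def unpack_nhwc(shape):
    """Mimic `_, height, width, _ = shape`, which needs 4 entries (batch, H, W, C)."""
    _, height, width, _ = shape
    return height, width


def expand_dims_front(shape):
    """Mimic the effect of tf.expand_dims(x, 0) on a shape: prepend a size-1 axis."""
    return [1] + list(shape)


# H, W, C with no batch axis (sizes are arbitrary, for illustration only)
single_image = [384, 512, 3]

try:
    unpack_nhwc(single_image)
    failed = False
except ValueError:
    # Three entries cannot fill four names, so unpacking raises ValueError.
    failed = True

# After prepending the batch axis, the same unpacking succeeds.
batched = expand_dims_front(single_image)
height, width = unpack_nhwc(batched)
print(failed, batched, height, width)
```

The sketch only models shapes. In the real pipeline the tensors themselves have to gain the leading axis, which is what the tf.expand_dims call above does.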


vladpaunescu commented on June 14, 2024

Please enable debug to check the exact error:

net = FlowNetS(debug=True)

If the error looks like this (as it did in my case):

Couldn't open CUDA library libcupti.so.8.0. LD_LIBRARY_PATH: /usr/local/cuda-8.0/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/lib:/opt/ffmpeg/lib
2017-09-29 18:35:26.710728: F ./tensorflow/stream_executor/lib/statusor.h:212] Non-OK-status: status_ status: Failed precondition: could not dlopen DSO: libcupti.so.8.0; dlerror: libcupti.so.8.0: cannot open shared object file: No such file or directory

Please add /usr/local/cuda/extras/CUPTI/lib64/ to your LD_LIBRARY_PATH
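
A quick way to check this before starting training is sketched below, using only the Python standard library. The helper names `cupti_dir_on_path` and `has_cupti_library` are made up for this illustration, and the default directory is simply the one mentioned above; adjust it if CUDA is installed elsewhere.

```python
import os

# Directory suggested in the comment above; an assumption about the local install.
CUPTI_DIR = "/usr/local/cuda/extras/CUPTI/lib64"


def cupti_dir_on_path(ld_library_path, cupti_dir=CUPTI_DIR):
    """Return True if cupti_dir is one of the entries of an LD_LIBRARY_PATH string."""
    wanted = os.path.normpath(cupti_dir)
    entries = [e for e in ld_library_path.split(":") if e]
    return any(os.path.normpath(e) == wanted for e in entries)


def has_cupti_library(cupti_dir=CUPTI_DIR):
    """Return True if the directory holds a file whose name starts with libcupti.so."""
    if not os.path.isdir(cupti_dir):
        return False
    return any(name.startswith("libcupti.so") for name in os.listdir(cupti_dir))


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("CUPTI dir on LD_LIBRARY_PATH:", cupti_dir_on_path(current))
    print("libcupti found in that dir:", has_cupti_library())
```

Note that the variable has to be exported in the shell before Python starts; changing os.environ from inside an already running process does not change where the dynamic loader looks.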

tensorflow/tensorflow#8830


vladpaunescu commented on June 14, 2024

Actually, the Segmentation Fault is still happening to me.
After running
gdb python
run src/flownet_s/train.py
I found this:

QueueRunner: corrupted record at 275253737

Later edit:

It seems the SIGSEGV is raised in the C++ augmentation code, in preprocessing.so:

Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fff7a7fc700 (LWP 13435)] std::_Function_handler<void(long long int, long long int), tensorflow::Augment(tensorflow::OpKernelContext*, const Device&, int, int, int, int, int, int, int, float const*, float*, float const*, float*) [with Device = Eigen::ThreadPoolDevice]::<lambda(tensorflow::int64, tensorflow::int64)> >::_M_invoke(const std::_Any_data &, <unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a61>, <unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a66>) (__functor=..., __args#0=<unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a61>, __args#1=<unknown type in /mnt/hdd1/git/caffe_experiments/obstacle-detection/flow/flownet2-tf/src/./ops/build/preprocessing.so, CU 0x72971, DIE 0xb7a66>) at /usr/include/c++/5/functional:1871 1871 (*_Base::_M_get_pointer(__functor))(


chuchienshu commented on June 14, 2024

Thanks for your kindness, bro. I trust your judgement, but I am not able to change the code in preprocessing.cc. Fortunately, I got a good result by commenting out part of that code and making a few careful modifications. Thanks all the same! @vladpaunescu


sampepose commented on June 14, 2024

Can you please send in a pull request or let me know what you changed? SP

…

On Sep 30, 2017, at 5:18 AM, chuchienshu ***@***.***> wrote: Thanks for your kindness, bro. I bow to your judgement, but I have no ability to change the code of prerpocessing.cc. Fortunately, I got ideal result via comment part of that code and did some delicate modify.Thanks the same! @vladpaunescu


vladpaunescu commented on June 14, 2024

@chuchienshu No worries 👍

Can you share the fix? I still can't train any model at all; all I get is a segmentation fault.
Also, it would help to know the exact TensorFlow version the code was developed on, as well as the Python, CUDA, and cuDNN versions. @sampepose, could you kindly add the requirements to README.md?


vladpaunescu commented on June 14, 2024

Hi!
I managed to train without any data augmentation. However, it would be nice to have the C++ code working, so any advice would be of great value to me.

Vlad


chuchienshu commented on June 14, 2024

@sampepose I just removed the data augmentation code in src/dataloader.py,
as shown below.

'''crop = [dataset_config['PREPROCESS']['crop_height'],
                dataset_config['PREPROCESS']['crop_width']]
        config_a = config_to_arrays(dataset_config['PREPROCESS']['image_a'])
        ......
 # Perform flow augmentation using spatial parameters from data augmentation
            flows = _preprocessing_ops.flow_augmentation(
                flows, transforms_from_a, transforms_from_b, crop)'''

@vladpaunescu It looks like you did the same, and I wish the C++ code worked, too. It looks awesome!


myhooo commented on June 14, 2024

@yinjunbo Hi, I ran into the same 'Segmentation fault (core dumped)' problem and removed the data augmentation code as chuchienshu described. However, now I get this error:
File "/home/hmy/flownet2_tf/src/flownet2/train.py", line 22, in <module>
  './checkpoints/FlowNetSD/flownet-SD.ckpt-0': ('FlowNet2/FlowNetSD', 'FlowNet2')
File "src/net.py", line 99, in train
  predictions = self.model(inputs, training_schedule)
File "src/flownet2/flownet2.py", line 19, in model
  _, height, width, _ = inputs['input_a'].shape.as_list()
ValueError: need more than 3 values to unpack
Have you run into the same situation? If you have solved it, it would be very kind of you to share some tips with me. Thanks~


myhooo commented on June 14, 2024

@yinjunbo Thank you very much~ It seems that you have recompiled successfully, congratulations!
I'd like to know whether the only thing you changed was the TensorFlow version.
Thank you in advance~


yinjunbo commented on June 14, 2024

@myhooo You're welcome. I've recompiled it without data augmentation, and version 1.2.0 seems to work.

