Comments (13)
I managed to solve this problem by adding/modifying some include paths in the TensorFlow headers and in the C++ ops. My setup:
CUDA 9.0
cuDNN 7005
TensorFlow in an anaconda3 env, installed with pip install tensorflow-gpu
1. When I got an error in mutex.h, I made the following substitutions:
#include "nsync_cv.h" -> #include "...../anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/external/nsync/public/nsync_cv.h"
#include "nsync_mu.h" -> #include ".../anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/external/nsync/public/nsync_mu.h"
2. When I got errors from cuda_device_functions.h, cuda_kernel_helper.h and cuda_launch_config.h, I changed the include in each of them:
#include "cuda/include/cuda.h" -> #include "/usr/local/cuda/include/cuda.h"
3. In correlation_op.cu.cc, add the following line below "using namespace tensorflow;":
typedef Eigen::GpuDevice GPUDevice;
Thanks for your help.
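The header edits in steps 1 and 2 may be avoidable by passing the matching include directories to the compiler instead of patching files inside site-packages. A minimal sketch, assuming the same directory layout as in the paths above; include_flags is a hypothetical helper and not part of UnFlow, and step 3 (the GPUDevice typedef) would still be needed:

```python
import os


def include_flags(tf_include, cuda_home="/usr/local/cuda"):
    """Return -I flags that let the TensorFlow headers resolve their includes.

    tf_include: TensorFlow's include directory, for example the value
                returned by tf.sysconfig.get_include().
    cuda_home:  CUDA install directory. Its last path component must be
                "cuda" because the headers include "cuda/include/cuda.h".
    """
    cuda_home = os.path.normpath(cuda_home)
    if os.path.basename(cuda_home) != "cuda":
        raise ValueError("expected a directory named 'cuda', got %r" % cuda_home)
    return [
        "-I" + tf_include,
        # nsync_cv.h and nsync_mu.h live in this directory
        "-I" + os.path.join(tf_include, "external", "nsync", "public"),
        # parent of the CUDA directory, so "cuda/include/cuda.h" resolves
        "-I" + os.path.dirname(cuda_home),
    ]


print(" ".join(include_flags("/opt/tf/include")))
```

The returned list would be appended to both the nvcc and the gcc command. If CUDA is installed in a directory that is not named "cuda", a symlink with that name is one way to satisfy the include path.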
from unflow.
Example exception stacktrace:
Traceback (most recent call last):
File "/Users/clauslang/UnFlow/src/e2eflow/ops.py", line 61, in <module>
op_lib = tf.load_op_library(lib_path)
File "/Users/clauslang/UnFlow/src/unflow_venv/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/Users/clauslang/UnFlow/src/unflow_venv/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(./backward_warp_op.so, 6): image not found
from unflow.
Hi, the custom tensorflow operations should automatically compile (if they are missing when executing run.py) to produce the .so files. It worked for me with tensorflow 1.7 and Ubuntu 17.10. Which command did you run to get this output?
from unflow.
Did you solve the problem?
from unflow.
Partially. One problem was that I hadn't installed the CUDA toolkit, so the nvcc command wasn't found. Maybe it's obvious, but it could be added to the dependencies:
sudo apt install nvidia-cuda-toolkit
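A quick way to check whether nvcc is reachable before the ops try to compile. This is only a sketch: find_nvcc is a hypothetical helper, and /usr/local/cuda/bin is an assumed extra location for installs that are not on PATH.

```python
import os
import shutil


def find_nvcc(extra_dirs=("/usr/local/cuda/bin",), path=None):
    """Return the full path to nvcc, or None if it cannot be found.

    path defaults to the PATH environment variable; extra_dirs are
    searched after it.
    """
    search = os.environ.get("PATH", "") if path is None else path
    dirs = [d for d in search.split(os.pathsep) if d] + list(extra_dirs)
    return shutil.which("nvcc", path=os.pathsep.join(dirs))


nvcc = find_nvcc()
print(nvcc if nvcc else "nvcc not found: install the CUDA toolkit or fix PATH")
```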
I get the same error now, but for a different reason:
~/UnFlow/src$ python run.py --help
WARNING:tensorflow:From /home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
In file included from /home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:21:0,
from backward_warp_op.cu.cc:8:
/home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_device_functions.h:32:31: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.
Traceback (most recent call last):
File "/home/clauslang/UnFlow/src/e2eflow/ops.py", line 61, in <module>
op_lib = tf.load_op_library(lib_path)
File "/home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./backward_warp_op.so: cannot open shared object file: No such file or directory
I use tensorflow 1.7 and Cuda 9.0 on Ubuntu 16.04.
from unflow.
I have the same issue:
Traceback (most recent call last):
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/ops.py", line 81, in <module>
op_lib = tf.load_op_library(lib_path)
File "/home/gsaibro/anaconda3/envs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/gsaibro/anaconda3/envs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./backward_warp_op.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run.py", line 19, in <module>
from e2eflow.core.train import Trainer
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/core/train.py", line 12, in <module>
from ..ops import forward_warp
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/ops.py", line 87, in <module>
op_lib = tf.load_op_library(lib_path)
File "/home/gsaibro/anaconda3/envs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/gsaibro/anaconda3/envs/tensorflowGPU/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./backward_warp_op.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
from unflow.
@simonmeister Hi Simon, how did you install TensorFlow: from source, or using something like Anaconda?
from unflow.
Following the discussion in tensorflow/tensorflow#15002, I removed the -D GOOGLE_CUDA=1 option from the nvcc command (line 43 in ops.py) and was thus able to produce the backward_warp_op.so file.
Now, I got a similar problem to @gsaibro:
~/UnFlow/src$ python run.py --help
Traceback (most recent call last):
File "/home/clauslang/UnFlow/src/e2eflow/ops.py", line 63, in <module>
op_lib = tf.load_op_library(lib_path)
File "/home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/clauslang/UnFlow/unflow_venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./backward_warp_op.so: undefined symbol: _Z16BackwardWarpGradRKN5Eigen9GpuDeviceENS_9TensorMapINS_6TensorIKfLi4ELi1ElEELi16ENS_11MakePointerEEES8_S8_NS3_INS4_IfLi4ELi1ElEELi16ES7_EE
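The mangled names in these "undefined symbol" errors are easier to reason about once demangled. A small sketch, assuming GNU c++filt (part of binutils) is installed; demangle is a hypothetical helper name:

```python
import shutil
import subprocess


def demangle(symbol):
    """Demangle a C++ symbol with c++filt.

    Returns the symbol unchanged if c++filt is not installed.
    """
    if shutil.which("c++filt") is None:
        return symbol
    out = subprocess.check_output(["c++filt", symbol], universal_newlines=True)
    return out.strip()


print(demangle("_ZTIN10tensorflow8OpKernelE"))
```

With c++filt available, _ZTIN10tensorflow8OpKernelE demangles to "typeinfo for tensorflow::OpKernel", so that .so is missing a symbol from the TensorFlow framework itself. The longer symbol above begins with BackwardWarpGrad(Eigen::GpuDevice const&, ...), which is the GPU implementation of the op, so in that case the library was linked without the compiled CUDA code.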
from unflow.
I used pip to install tensorflow-gpu. @clauslang I get the same issue when GOOGLE_CUDA is not defined, as the CUDA code isn't compiled in that case. When keeping the flag it works for me.
from unflow.
@clauslang It seems that cuda.h is not found. The current code expects cuda to be in /usr/local/cuda. I am not exactly sure if that is where it is put when you install it with apt. In most cases it's better to use the installer from the NVIDIA site to get a clean install.
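To see where cuda.h actually ended up, something like the following can help. It is a sketch only: locate_cuda_home is a hypothetical helper, and reading a CUDA_HOME environment variable is a common convention rather than something the UnFlow code is known to do.

```python
import os


def locate_cuda_home(candidates=None):
    """Return the first candidate directory containing include/cuda.h, else None."""
    if candidates is None:
        # Assumed defaults: an optional CUDA_HOME variable, then the
        # location the current code expects.
        candidates = [os.environ.get("CUDA_HOME"), "/usr/local/cuda"]
    for home in candidates:
        if home and os.path.isfile(os.path.join(home, "include", "cuda.h")):
            return home
    return None


print(locate_cuda_home() or "cuda.h not found in the default locations")
```

If this returns None although CUDA is installed, the headers are somewhere else and the include path (or a symlink at /usr/local/cuda) has to point there.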
from unflow.
Thanks, @simonmeister, for the clarification! I got a bit confused there: I did have cuda installed, but thought I had to install nvcc on top of that (instead of just pointing to the correct cuda install location).
@gsaibro For now, I removed the -D GOOGLE_CUDA=1 flag from both the nvcc and the gcc command, and that resolved the issue for me. Removing it only from the nvcc command indeed results in the same error.
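Since the flag has to match between the two commands, one option is to build both from a single list of defines. The sketch below is an assumption about how that could look: build_commands is hypothetical, the nvcc options are modeled on the command shown in this thread, and the g++ line is a generic shared-library link, not the actual one from ops.py.

```python
def build_commands(name, tf_flags, use_cuda=True):
    """Return (nvcc_cmd, gcc_cmd) as argument lists sharing the same defines.

    name:     op name without extension, e.g. "backward_warp_op"
    tf_flags: include/link flags for TensorFlow, passed to both commands
    """
    defines = ["-D_GLIBCXX_USE_CXX11_ABI=0"]
    if use_cuda:
        # Must be present in both commands or in neither.
        defines += ["-D", "GOOGLE_CUDA=1"]
    nvcc_cmd = ["nvcc", "-std=c++11", "-c", "-o", name + ".cu.o",
                name + ".cu.cc", "-x", "cu", "-Xcompiler", "-fPIC",
                "--expt-relaxed-constexpr"]
    gcc_cmd = ["g++", "-std=c++11", "-shared", "-o", name + ".so",
               name + ".cc", name + ".cu.o", "-fPIC"]
    return nvcc_cmd + tf_flags + defines, gcc_cmd + tf_flags + defines


nvcc_cmd, gcc_cmd = build_commands("backward_warp_op", ["-I/opt/tf/include"])
print(" ".join(nvcc_cmd))
print(" ".join(gcc_cmd))
```

Going by the comments above, building with use_cuda=False means the CUDA code is not compiled at all, so it works around the link error rather than producing GPU kernels.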
from unflow.
Thanks @simonmeister and @clauslang.
With '-D GOOGLE_CUDA=1' removed I can get past this part, but I still get stuck when downsampling is called.
Keeping '-D GOOGLE_CUDA=1' and setting the environment variables, I got a little further, but now get an error when building correlation_op.cu.cc (the two GPUDevice errors in the output below). Do you have any idea what is causing that, @simonmeister? Thanks.
(tensorflow) gsaibro@IHUW074 /media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src $ python run.py
WARNING:tensorflow:From /home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/generated_message_reflection.h(685): warning: variable "unused" was set but never used
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/generated_message_reflection.h(685): warning: variable "unused" was set but never used
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign
/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign
correlation_op.cu.cc(249): error: identifier "GPUDevice" is undefined
correlation_op.cu.cc(316): error: identifier "GPUDevice" is undefined
correlation_op.cu.cc(331): warning: variable "kernel_size_" was declared but never referenced
2 errors detected in the compilation of "/tmp/tmpxft_00001c80_00000000-6_correlation_op.cu.cpp1.ii".
Traceback (most recent call last):
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/ops.py", line 63, in <module>
op_lib = tf.load_op_library(lib_path)
File "/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./correlation_op.so: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run.py", line 19, in <module>
from e2eflow.core.train import Trainer
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/core/train.py", line 12, in <module>
from ..ops import forward_warp
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/ops.py", line 65, in <module>
compile(n)
File "/media/gsaibro/DATA/InternshipIrcad/FlowNet2/UnFlow-master/src/e2eflow/ops.py", line 46, in compile
subprocess.check_output(nvcc_cmd, shell=True)
File "/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/subprocess.py", line 316, in check_output
**kwargs).stdout
File "/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/subprocess.py", line 398, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'nvcc -std=c++11 -c -o correlation_op.cu.o correlation_op.cu.cc -I/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include -I/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow/include/external/nsync/public -D_GLIBCXX_USE_CXX11_ABI=0 -L/home/gsaibro/anaconda3/envs/tensorflow/lib/python3.5/site-packages/tensorflow -ltensorflow_framework -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -I /usr/local --expt-relaxed-constexpr' returned non-zero exit status 1
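In output like this the real cause (the two GPUDevice errors) is easy to miss between the warnings and the two tracebacks. A minimal sketch of a wrapper that attaches the compiler's stderr to the exception; run_compile is a hypothetical helper and not how ops.py currently invokes nvcc:

```python
import subprocess
import sys


def run_compile(cmd):
    """Run a compile command (argument list) and return its stdout.

    Raises RuntimeError that includes the compiler's stderr on failure.
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    if proc.returncode != 0:
        raise RuntimeError("compile failed (exit %d): %s\n%s"
                           % (proc.returncode, " ".join(cmd), proc.stderr))
    return proc.stdout


# Stand-in for a compiler call so the example runs anywhere.
print(run_compile([sys.executable, "-c", "print('ok')"]).strip())
```

Passing the command as a list also avoids shell=True, so paths with spaces do not need quoting.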
from unflow.
Had exactly the same error and fixed it by downgrading from TF 1.12 to 1.7 without any further adjustments.
CUDA: 9.0, V9.0.176 (installed manually from nvidia)
TensorFlow: 1.7 (pip install tensorflow-gpu==1.7)
Ubuntu 18.04
Python: 3.6.8
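A guard like the following could warn early when the installed TensorFlow is not a version reported to work in this thread. It is a sketch: both function names are hypothetical, and the only version treated as working is 1.7, as mentioned in the comments above.

```python
def parse_version(text):
    """Turn a version string such as '1.12.0' into a (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def is_reported_working(tf_version, working=((1, 7),)):
    """True if tf_version matches a (major, minor) pair reported to work."""
    return parse_version(tf_version) in working


for version in ("1.7.0", "1.12.0"):
    print(version, is_reported_working(version))
```

In practice the string would come from tf.__version__.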
from unflow.