
cuda_tensorflow_opencv's People

Contributors

mmartial


cuda_tensorflow_opencv's Issues

RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

Hi,
Thanks for sharing the images and comprehensive instructions! They are very helpful!

When I use the image datamachines/cudnn_tensorflow_opencv:10.2_2.3.1_4.5.0-20201204, TensorFlow has problems. Specifically, when I run the two following lines from Python:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

it results in this error:

  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/test_util.py", line 131, in gpu_device_name
    for x in device_lib.list_local_devices():
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

1.15.2 images are built without tensorflow-gpu (irrespective of CUDA or CUDNN)

Looks like all 1.15.2 images are built with tensorflow and not tensorflow-gpu.

In Makefile:
@$(eval CTO_TENSORFLOW_PYTHON=$(shell if [ "A${CTO_TMP}" == "A${STABLE_TF1}" ]; then echo "tensorflow==${STABLE_TF1}"; else if [ ${CTO_SC} == 1 ]; then echo "tensorflow==${STABLE_TF2}"; else echo "tensorflow-gpu==${STABLE_TF2}"; fi; fi))

Is this intentional?
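For anyone tracing the conditional, here is a stand-alone shell reproduction of the selection logic above (the variable values are illustrative, not taken from the Makefile): the first branch matches every 1.15.2 build before the CPU-vs-GPU test is ever reached, which is why tensorflow-gpu is never selected for TF1.

```shell
# Illustrative values; STABLE_TF2 and CTO_SC vary per build target.
STABLE_TF1="1.15.2"; STABLE_TF2="2.4.1"
CTO_TMP="1.15.2"    # requested TF version
CTO_SC=0            # 0 = CUDA/CUDNN-capable build

if [ "A${CTO_TMP}" = "A${STABLE_TF1}" ]; then
  # TF1 requested: always plain tensorflow, GPU flag never consulted
  pkg="tensorflow==${STABLE_TF1}"
elif [ "${CTO_SC}" = "1" ]; then
  pkg="tensorflow==${STABLE_TF2}"
else
  pkg="tensorflow-gpu==${STABLE_TF2}"
fi
echo "$pkg"
```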

Support for the latest version of PyTorch 1.8.0, Tensorflow 2.3.1 and OpenCV 4.5.0

@mmartial - Stumbled across your repo and found it to be quite handy! Thanks for putting this together.

We're looking to create a docker image that contains the latest combinations of:

  • Tensorflow 2.3.1
  • CUDA 10.1, 10.2, 11.0
  • PyTorch 1.7.0 and the corresponding torchvision version (unsure which one is correct)
  • OpenCV 4.5.0

Any chance you could add support for these combinations?

Thanks!

Build failed - make cudnn_tensorflow_opencv-10.2_2.4.1_4.5.2

Hi,
I would like to build an image with CUDA 10.2, TF 2.4.1 and OpenCV 4.5.2 support, but the build fails while compiling TensorFlow.
Thanks for your help.

[18,609 / 24,292] Compiling tensorflow/core/grappler/optimizers/arithmetic_optimizer.cc [for host]; 16s local ... (12 actions running)
[18,875 / 24,292] Compiling tensorflow/core/kernels/conv_2d_gpu_uint8.cu.cc [for host]; 84s local ... (12 actions running)
[19,033 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 189s local ... (12 actions running)
[19,153 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 636s local ... (12 actions running)
[19,154 / 24,292] Compiling tensorflow/core/kernels/pad_op_gpu.cu.cc [for host]; 1168s local ... (12 actions, 11 running)
ERROR: /usr/local/src/tensorflow/tensorflow/core/kernels/linalg/BUILD:193:18: C++ compilation of rule '//tensorflow/core/kernels/linalg:matrix_square_root_op' failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /root/.cache/bazel/bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/execroot/org_tensorflow &&
exec env -
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64
PATH=/root/.cache/bazelisk/downloads/bazelbuild/bazel-3.7.2-linux-x86_64/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/proc/self/cwd
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/tensorflow/core/kernels/linalg/objs/matrix_square_root_op/matrix_square_root_op.pic.d '-frandom-seed=bazel-out/host/bin/tensorflow/core/kernels/linalg/objs/matrix_square_root_op/matrix_square_root_op.pic.o' -DTENSORFLOW_USE_CUSTOM_CONTRACTION_KERNEL -DTENSORFLOW_USE_MKLDNN_CONTRACTION_KERNEL -DHAVE_SYS_UIO_H -DTF_USE_SNAPPY -DCURL_STATICLIB -DPLATFORM_LINUX -DENABLE_CURL_CLIENT -DOPENSSL_IS_BORINGSSL -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' -D__CLANG_SUPPORT_DYN_ANNOTATION -iquote . -iquote bazel-out/host/bin -iquote external/com_google_absl -iquote bazel-out/host/bin/external/com_google_absl -iquote external/nsync -iquote bazel-out/host/bin/external/nsync -iquote external/eigen_archive -iquote bazel-out/host/bin/external/eigen_archive -iquote external/gif -iquote bazel-out/host/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/host/bin/external/libjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/host/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/host/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/host/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/host/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/host/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/host/bin/external/zlib -iquote external/local_config_cuda -iquote bazel-out/host/bin/external/local_config_cuda -iquote external/local_config_tensorrt -iquote bazel-out/host/bin/external/local_config_tensorrt -iquote external/double_conversion -iquote bazel-out/host/bin/external/double_conversion -iquote external/snappy -iquote bazel-out/host/bin/external/snappy -iquote external/curl -iquote bazel-out/host/bin/external/curl -iquote external/boringssl -iquote bazel-out/host/bin/external/boringssl 
-iquote external/jsoncpp_git -iquote bazel-out/host/bin/external/jsoncpp_git -iquote external/aws -iquote bazel-out/host/bin/external/aws -iquote external/aws-c-common -iquote bazel-out/host/bin/external/aws-c-common -iquote external/aws-c-event-stream -iquote bazel-out/host/bin/external/aws-c-event-stream -iquote external/aws-checksums -iquote bazel-out/host/bin/external/aws-checksums -iquote external/mkl_dnn -iquote bazel-out/host/bin/external/mkl_dnn -Ibazel-out/host/bin/external/local_config_cuda/cuda/virtual_includes/cuda_headers_virtual -Ibazel-out/host/bin/external/local_config_tensorrt/virtual_includes/tensorrt_headers -Ibazel-out/host/bin/external/local_config_cuda/cuda/virtual_includes/cudnn_header -Ibazel-out/host/bin/external/local_config_cuda/cuda/virtual_includes/cublas_headers_virtual -Ibazel-out/host/bin/external/local_config_cuda/cuda/virtual_includes/cusolver_headers_virtual -isystem external/nsync/public -isystem bazel-out/host/bin/external/nsync/public -isystem external/eigen_archive -isystem bazel-out/host/bin/external/eigen_archive -isystem external/gif -isystem bazel-out/host/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/host/bin/external/com_google_protobuf/src -isystem external/farmhash_archive/src -isystem bazel-out/host/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/host/bin/external/zlib -isystem external/local_config_cuda/cuda -isystem bazel-out/host/bin/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/cuda/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cuda/include -isystem external/double_conversion -isystem bazel-out/host/bin/external/double_conversion -isystem external/curl/include -isystem bazel-out/host/bin/external/curl/include -isystem external/boringssl/src/include -isystem bazel-out/host/bin/external/boringssl/src/include -isystem external/jsoncpp_git/include -isystem bazel-out/host/bin/external/jsoncpp_git/include 
-isystem external/aws/aws-cpp-sdk-core/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-core/include -isystem external/aws/aws-cpp-sdk-s3/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-s3/include -isystem external/aws/aws-cpp-sdk-transfer/include -isystem bazel-out/host/bin/external/aws/aws-cpp-sdk-transfer/include -isystem external/aws-c-common/include -isystem bazel-out/host/bin/external/aws-c-common/include -isystem external/aws-c-event-stream/include -isystem bazel-out/host/bin/external/aws-c-event-stream/include -isystem external/aws-checksums/include -isystem bazel-out/host/bin/external/aws-checksums/include -isystem external/local_config_cuda/cuda/cublas/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cublas/include -isystem external/local_config_cuda/cuda/cusolver/include -isystem bazel-out/host/bin/external/local_config_cuda/cuda/cusolver/include -isystem external/mkl_dnn/include -isystem bazel-out/host/bin/external/mkl_dnn/include -isystem external/mkl_dnn/src -isystem bazel-out/host/bin/external/mkl_dnn/src -isystem external/mkl_dnn/src/common -isystem bazel-out/host/bin/external/mkl_dnn/src/common -isystem external/mkl_dnn/src/cpu -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu -isystem external/mkl_dnn/src/cpu/gemm -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/gemm -isystem external/mkl_dnn/src/cpu/xbyak -isystem bazel-out/host/bin/external/mkl_dnn/src/cpu/xbyak -Wno-builtin-macro-redefined '-D__DATE="redacted"' '-D__TIMESTAMP="redacted"' '-D__TIME
="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 -w -Wno-sign-compare -g0 '-std=c++14' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions '-DGOOGLE_CUDA=1' '-DTENSORFLOW_USE_NVCC=1' -msse3 -pthread '-DGOOGLE_CUDA=1' -c tensorflow/core/kernels/linalg/matrix_square_root_op.cc -o bazel-out/host/bin/tensorflow/core/kernels/linalg/_objs/matrix_square_root_op/matrix_square_root_op.pic.o)
Execution platform: @local_execution_config_platform//:platform
x86_64-linux-gnu-gcc-7: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-7/README.Bugs for instructions.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
ERROR: /usr/local/src/tensorflow/tensorflow/tools/pip_package/BUILD:69:10 C++ compilation of rule '//tensorflow/core/kernels/linalg:matrix_square_root_op' failed (Exit 4): crosstool_wrapper_driver_is_not_gcc failed: error executing command
Execution platform: @local_execution_config_platform//:platform
INFO: Elapsed time: 3964.637s, Critical Path: 1182.82s
INFO: 19166 processes: 8754 internal, 10412 local.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully
Command exited with non-zero status 1
0.30user 3.15system 1:06:07elapsed 0%CPU (0avgtext+0avgdata 12632maxresident)k
10141048inputs+40outputs (42403major+19178minor)pagefaults 0swaps
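A note on the failure above: "internal compiler error: Killed (program cc1plus)" almost always means the kernel's OOM killer terminated the compiler, not a real GCC bug. Capping Bazel's parallelism and RAM estimate (these are real Bazel flags; the values here are illustrative) usually gets the build through on machines with limited memory:

```shell
# Fewer simultaneous compile actions -> lower peak memory.
jobs=4
opts="--jobs=${jobs} --local_ram_resources=HOST_RAM*.5"
echo "bazel build ${opts} //tensorflow/tools/pip_package:build_pip_package"
```

Adding swap on the build host is the other common workaround.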

cudnn_tensorflow_opencv docker image does not detect GPU when running opencv dnn module

Hi, I need OpenCV's dnn module to construct a machine learning model and do a single forward pass. When I run the Python code without the cudnn_tensorflow_opencv docker image, the video I am processing takes about 45 seconds. When I use the cudnn_tensorflow_opencv docker image, it takes the same time. I can monitor GPU activity through nvidia-smi: it is not used. Neither OpenCV nor anything else prints any info about a found GPU. I use Docker version > 20, and I also used the flag --gpus all just in case. Nothing happens. How can I use the GPU with OpenCV's dnn module?
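For what it's worth, OpenCV's dnn module does not pick the GPU automatically even in a CUDA-enabled build; the CUDA backend and target have to be selected on the network explicitly. A minimal sketch (the cv2_module parameter exists only so the helper can be exercised without OpenCV installed; it is not part of any real API):

```python
import importlib


def enable_cuda_dnn(net, cv2_module=None):
    """Route a cv2.dnn network's inference to CUDA.

    If OpenCV was built without the CUDA backend, setPreferableBackend
    silently falls back to CPU, which matches the "same speed" symptom.
    """
    cv2 = cv2_module if cv2_module is not None else importlib.import_module("cv2")
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    return net
```

Also worth checking inside the container: cv2.cuda.getCudaEnabledDeviceCount() should be non-zero if the OpenCV build actually sees the GPU.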

CUDA 11

Do you have CUDA 11 support avail?

Problem with jetsonnano-cuda_tensorflow_opencv:10.0_2.1.0_4.3.0-20200515

Thanks for providing these docker images. I hoped to get around compiling OpenCV ;-)
I used the docker image jetsonnano-cuda_tensorflow_opencv:10.0_2.1.0_4.3.0-20200515.

But when doing
import cv2 in Python 3.6, I get the error that libnppc.so.10 is not found.
I assume the NVIDIA base docker image has been upgraded to JetPack 4.4, which uses CUDA 10.2
(instead of JetPack 4.3, which uses CUDA 10.0).
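A quick way to confirm that hypothesis from inside the container is to ask the dynamic linker which NPP runtime (if any) it can see:

```shell
# libnppc.so.10 ships with CUDA 10.x; if only libnppc.so.10.2-era paths
# (or nothing) show up, the base image's CUDA version has moved on.
ldconfig -p 2>/dev/null | grep libnppc || echo "libnppc not in the linker cache"
```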

Am I right?
Is the solution to use another base container in the Makefile/Dockerfile, instead of
nvcr.io/nvidia/deepstream-l4t:4.0.2-19.12-base resp. nvcr.io/nvidia/l4t-base:r32.3.1?

I cannot figure out what to use instead, as the nvcr catalog is ... well, I cannot find anything there :-(

Any help is appreciated.
Thanks

Error when running the container on

Hello, I am really thankful for the images and the prebuilt containers.

But I am encountering this error when trying to train a CV model with the prebuilt container datamachines/cudnn_tensorflow_opencv:11.6.2_2.9.1_4.6.0-20220815 (this error is logged by a Vertex AI custom job, as I can't test the setup locally, having no GPU on my local machine 😢).

I was running the job with n1-standard-4 machine with NVIDIA_TESLA_K80 attached.

2023-02-11 18:02:27.461890: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64

2023-02-11 18:02:27.461987: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

It seems like the environment variable is not set correctly, but I thought the prebuilt container already takes care of that.
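One way to debug this from inside the container is to locate libcudart yourself and prepend its directory to LD_LIBRARY_PATH (the /usr/local search root is an assumption; the snippet is harmless if nothing is found):

```shell
# Find the exact library the loader is complaining about.
loc=$(find /usr/local -name 'libcudart.so.11.0' 2>/dev/null | head -n 1)
if [ -n "$loc" ]; then
  # Put its directory first so the dynamic loader can resolve it.
  export LD_LIBRARY_PATH="$(dirname "$loc"):${LD_LIBRARY_PATH}"
fi
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
```

If find returns nothing at all, the library genuinely isn't in the image and the LD_LIBRARY_PATH value is not the problem.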

Any help is really appreciated. Really thankful for your time!

UNRAID build: change webui password

Hi! Thanks for this.

I am wondering if you could help me find out how to change the password of the webui?

I've tried these things:

  • adding a new variable within the Unraid template
  • manually changing the value in /home/jupyter/.jupyter/jupyter_notebook_config.json

.. to no avail.

Any idea? Thanks
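In case it helps: the value in jupyter_notebook_config.json is a salted hash, not the plain password, so manually writing a new plaintext value there won't work. Below is a stdlib-only sketch of the legacy salted-hash scheme the classic Notebook uses (what notebook.auth.passwd produces); write the result into the config and restart the container's Jupyter process:

```python
import hashlib
import secrets


def notebook_passwd(passphrase, algorithm="sha1"):
    """Produce a salted hash in the '<alg>:<salt>:<digest>' form that the
    classic Jupyter Notebook stores under NotebookApp.password."""
    salt = secrets.token_hex(6)  # 12 hex characters of salt
    digest = hashlib.new(
        algorithm, passphrase.encode("utf-8") + salt.encode("ascii")
    ).hexdigest()
    return f"{algorithm}:{salt}:{digest}"
```

Usage: put the returned string into jupyter_notebook_config.json as {"NotebookApp": {"password": "sha1:..."}}. Newer Jupyter versions use an argon2 scheme instead, so check which one your image ships.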

DockerFiles

Hi,
Thanks for providing the builds on Docker Hub.
I'm looking to make some changes to one of the images to include NVDEC in the OpenCV build and be able to use cv2.cudacodec.createVideoReader.
Are the original Dockerfiles available somewhere?
Thanks

Docker builds failing on Jetson NX

Hi, I'm not sure if this is meant to work on the NX too, or if you've only tested on Nanos?

I've cloned the repo to the NX, and tried building with

make tensorflow_opencv
and then
make cudnn_tensorflow_opencv

The first gave:

Step 24/42 : RUN curl -s -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v${LATEST_BAZELISK}/bazelisk-linux-amd64   && chmod +x /usr/local/bin/bazel   && mkdir -p /usr/local/src/tensorflow   && cd /usr/local/src   && wget -q --no-check-certificate -c https://github.com/tensorflow/tensorflow/archive/v${CTO_TENSORFLOW_VERSION}.tar.gz -O - | tar --strip-components=1 -xz -C /usr/local/src/tensorflow   && cd /usr/local/src/tensorflow   && fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne '$lb="'${LATEST_BAZEL}'";$brv=$1 if (m%\=\s+.([\d\.]+).$+%); sub numit{@g=split(m%\.%,$_[0]);return(1000000*$g[0]+1000*$g[1]+$g[2]);}; if (&numit($brv) > &numit($lb)) { print "$lb" } else {print "$brv"};' > .bazelversion   && bazel clean   && chmod +x /tmp/tf_build.sh   && time /tmp/tf_build.sh ${CTO_TF_CUDNN} ${CTO_TF_OPT}   && time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg   && time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl   && rm -rf /usr/local/src/tensorflow /tmp/tensorflow_pkg /tmp/bazel_check.pl /tmp/tf_build.sh /tmp/hsperfdata_root /root/.cache/bazel /root/.cache/pip /root/.cache/bazelisk
 ---> Running in 3059bbf7dc96
/bin/sh: 1: bazel: Exec format error
The command '/bin/sh -c curl -s -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v${LATEST_BAZELISK}/bazelisk-linux-amd64   && chmod +x /usr/local/bin/bazel   && mkdir -p /usr/local/src/tensorflow   && cd /usr/local/src   && wget -q --no-check-certificate -c https://github.com/tensorflow/tensorflow/archive/v${CTO_TENSORFLOW_VERSION}.tar.gz -O - | tar --strip-components=1 -xz -C /usr/local/src/tensorflow   && cd /usr/local/src/tensorflow   && fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne '$lb="'${LATEST_BAZEL}'";$brv=$1 if (m%\=\s+.([\d\.]+).$+%); sub numit{@g=split(m%\.%,$_[0]);return(1000000*$g[0]+1000*$g[1]+$g[2]);}; if (&numit($brv) > &numit($lb)) { print "$lb" } else {print "$brv"};' > .bazelversion   && bazel clean   && chmod +x /tmp/tf_build.sh   && time /tmp/tf_build.sh ${CTO_TF_CUDNN} ${CTO_TF_OPT}   && time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg   && time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl   && rm -rf /usr/local/src/tensorflow /tmp/tensorflow_pkg /tmp/bazel_check.pl /tmp/tf_build.sh /tmp/hsperfdata_root /root/.cache/bazel /root/.cache/pip /root/.cache/bazelisk' returned a non-zero code: 2
Makefile:202: recipe for target 'actual_build' failed
make[3]: *** [actual_build] Error 2
make[3]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:145: recipe for target 'build_prep' failed
make[2]: *** [build_prep] Error 2
make[2]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:138: recipe for target 'tensorflow_opencv-1.15.5_3.4.14' failed
make[1]: *** [tensorflow_opencv-1.15.5_3.4.14] Error 2
make[1]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:132: recipe for target 'tensorflow_opencv' failed
make: *** [tensorflow_opencv] Error 2

And the second gave:

Step 4/43 : RUN apt-get update -y --fix-missing  && apt-get install -y --no-install-recommends     apt-utils     locales     wget     ca-certificates   && apt-get clean
 ---> Running in 52221b94691a
standard_init_linux.go:211: exec user process caused "exec format error"
The command '/bin/sh -c apt-get update -y --fix-missing  && apt-get install -y --no-install-recommends     apt-utils     locales     wget     ca-certificates   && apt-get clean' returned a non-zero code: 1
Makefile:202: recipe for target 'actual_build' failed
make[3]: *** [actual_build] Error 1
make[3]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:145: recipe for target 'build_prep' failed
make[2]: *** [build_prep] Error 2
make[2]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:141: recipe for target 'cudnn_tensorflow_opencv-9.2_1.15.5_3.4.14' failed
make[1]: *** [cudnn_tensorflow_opencv-9.2_1.15.5_3.4.14] Error 2
make[1]: Leaving directory '/home/chicken/datamachines/cuda_tensorflow_opencv'
Makefile:135: recipe for target 'cudnn_tensorflow_opencv' failed
make: *** [cudnn_tensorflow_opencv] Error 2

Did I miss a step?
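For reference, "Exec format error" is the kernel refusing a binary built for another architecture: the Dockerfile fetches bazelisk-linux-amd64, which cannot run on an arm64 Jetson, and the second failure is the same mismatch triggered by an x86_64 base image. Selecting the artifact by uname -m would sidestep the first problem (the suffix naming follows the bazelisk release convention):

```shell
# Map the kernel's architecture name onto the release artifact suffix.
case "$(uname -m)" in
  x86_64)  suffix="amd64" ;;
  aarch64) suffix="arm64" ;;
  *)       suffix="unknown" ;;
esac
echo "bazelisk-linux-${suffix}"
```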

Building Legacy Tensorflow

Is it possible to build with an older version of TensorFlow, 1.14.0?
I changed the stable version of TF to this, but it fails to build and complains:

You have bazel 4.1.0 installed.
Please downgrade your bazel installation to version 0.25.2 or lower to build TensorFlow! To downgrade: download the installer for the old version
(from https://github.com/bazelbuild/bazel/releases) then run the installer.
-- bazel command to run:
bazel build --verbose_failures --config=opt --config=v1 --config=cuda //tensorflow/tools/pip_package:build_pip_package
Starting local Bazel server and connecting to it...

Changing the LATEST_BAZEL version in the makefile doesn't seem to work.
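The build path above pins Bazel by writing a .bazelversion file, which bazelisk honors over any other setting; a sketch of forcing it directly to what TF 1.14 accepts (0.25.2, per the error message), run here in a scratch directory:

```shell
# Work in a throwaway directory so nothing in the repo is touched.
workdir=$(mktemp -d)
cd "$workdir"
# bazelisk reads this file and downloads exactly this Bazel version,
# overriding whatever LATEST_BAZEL resolved to.
echo "0.25.2" > .bazelversion
cat .bazelversion
```

In the real build, the file would need to be written in the TensorFlow source directory before bazel is invoked, i.e. at the point where the Makefile's perl one-liner currently generates it.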

Unraid python3 packages on restart

Pardon my ignorance.

I have to pip install some packages for my use case. As some of you know, after a docker restart the packages are not there anymore and have to be reinstalled.

Is there a way to keep my installed packages (apt-get or pip) to remain persistent on docker reboot?

Thank you!
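One common pattern for this (a sketch; the /tmp path here stands in for a host directory bind-mounted into the container, e.g. via -v /mnt/user/appdata/pip:/opt/persistent in the Unraid template): install into the mounted directory and point PYTHONPATH at it, so the packages survive because they live on the host side of the mount.

```shell
# Directory that would be a bind mount in the real container.
mkdir -p /tmp/persistent-pip
export PYTHONPATH="/tmp/persistent-pip:${PYTHONPATH}"
# pip3 install --target=/tmp/persistent-pip <your-packages>
echo "$PYTHONPATH"
```

The alternative is docker commit on a configured container to bake the packages into a new image, at the cost of re-committing after every change. apt-get installs can only be persisted the docker commit way.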

Docker build fail for Jetson Nano 20210218

Hello,

My docker build fails at step 27/41.

 ---> Running in f4cf6fcad4d7
Traceback (most recent call last):
  File "setup.py", line 12, in <module>
    import torch
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
The command '/bin/sh -c mkdir -p /usr/local/src/torchvision   && wget -q --no-check-certificate https://github.com/pytorch/vision/archive/v0.8.2.tar.gz -O - | tar --strip-components=1 -xz -C /usr/local/src/torchvision   && cd /usr/local/src/torchvision   && python3 setup.py install    && rm -rf /root/.cache/pip /usr/local/src/torchvision' returned a non-zero code: 1
Makefile:95: recipe for target 'actual_build' failed
make[2]: *** [actual_build] Error 1
make[2]: Leaving directory '/home/lisalink/cuda_tensorflow_opencv/JetsonNano'
Makefile:68: recipe for target 'build_prep' failed
make[1]: *** [build_prep] Error 2
make[1]: Leaving directory '/home/lisalink/cuda_tensorflow_opencv/JetsonNano'
Makefile:65: recipe for target 'jetsonnano-cuda_tensorflow_opencv-10.2_2.3_4.5.1' failed
make: *** [jetsonnano-cuda_tensorflow_opencv-10.2_2.3_4.5.1] Error 2

I installed Jetpack 4.5.0 without doing anything else, not even apt update or apt upgrade.
Can you please help with this issue?

Problem on making tensorflow work with gpu (for 10.2_2.1.0_4.3.0-20200423)

First, thanks for helping me out last time.

While working with the "datamachines/cudnn_tensorflow_opencv:10.2_2.1.0_4.3.0-20200423" image, I have no problem enabling CUDA support, but when I try to use TensorFlow with the GPU, it does not detect my GPU, i.e. running "import tensorflow as tf; print(len(tf.config.experimental.list_physical_devices('GPU')))" returns 0.

Is there a way to fix it, or do I need to download another image with CUDA 10.1?

Please help me out.

If possible, please also mention a way to install darknetpy in any of the images (I think it would be a very good enhancement for ML docker images like this).

Unable to EXPOSE port

After running the image with cudnn, I run the command "jupyter-notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root", but I am not able to open the notebook in Chrome. Please help me out.
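The --port flag only sets the port inside the container; it also has to be published to the host when the container is started. A sketch of the docker run line (the image tag is illustrative, taken from another issue on this page):

```shell
# -p host:container publishes the port; --ip=0.0.0.0 makes Jupyter listen
# on all container interfaces so the published port actually reaches it.
cmd="docker run --rm --gpus all -p 8888:8888 \
datamachines/cudnn_tensorflow_opencv:11.6.2_2.9.1_4.6.0-20220815 \
jupyter-notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root"
echo "$cmd"
```

Then browse to http://localhost:8888 and paste the token printed in the container logs.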
