
Comments (7)

gtoderici commented on June 14, 2024

FYI -- it's likely a CUDA issue.

Try the following code. If you see any errors, the CUDA/cuDNN configuration is probably bad. Compare against the "correct" output below (in my case I have a Quadro + K40, but only the K40 is listed):

$ ipython
In [1]: import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally

In [2]: tf.test.device_lib.list_local_devices()
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:05:00.0
Total memory: 11.17GiB
Free memory: 11.10GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x21973b0
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: Quadro K600
major: 3 minor: 0 memoryClockRate (GHz) 0.8755
pciBusID 0000:04:00.0
Total memory: 964.81MiB
Free memory: 460.50MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:855] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1028] Ignoring gpu device (device: 1, name: Quadro K600, pci bus id: 0000:04:00.0) with Cuda multiprocessor count: 1. The minimum required count is 8. You can adjust this requirement with the env var TF_MIN_GPU_MULTIPROCESSOR_COUNT.
Out[2]:
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
bus_adjacency: BUS_ANY
incarnation: 4541115684887390268
,
name: "/gpu:0"
device_type: "GPU"
memory_limit: 11323454260
incarnation: 7838963373212564962
physical_device_desc: "device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0"
]

What fixed it for me (to get the above output) was:

$ sudo apt-get install nvidia-modprobe

(I'm assuming you're running on Ubuntu.)
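
As a quick pre-check before importing TensorFlow, something like the sketch below may help confirm that the system loader can see the CUDA libraries named in the log lines above. This is only an illustration, not part of neural-style-tf: `find_cuda_libs` and `CUDA_LIBS` are hypothetical names, and the lookup uses the standard library's `ctypes.util.find_library`, which does not necessarily search the same paths TensorFlow does. A "NOT FOUND" here is a hint, not proof of a broken install.

```python
import ctypes.util

# Short names of the libraries that appear in the "successfully opened
# CUDA library" log lines above (assumption: the loader knows them by
# these names on your system).
CUDA_LIBS = ["cublas", "cudnn", "cufft", "cuda", "curand"]


def find_cuda_libs(names=CUDA_LIBS):
    """Return a dict mapping each short name to what the loader reports, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, location in sorted(find_cuda_libs().items()):
        print("%-8s %s" % (name, location or "NOT FOUND"))
```

If everything is found but TensorFlow still fails to list the GPU, the problem is more likely in the driver or device nodes than in the library paths.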

from neural-style-tf.

krystophv commented on June 14, 2024

I'm also getting the Shape(...) must have rank 1 error, raised with a similar stack trace (line 361 in gram_matrix being the last line before library code). I have checked that Python can load TensorFlow and run some simple examples, so I don't think it's a CUDA misconfiguration.

Ubuntu 16.04
TensorFlow 0.9.0
OpenCV 2.14.3
CUDA 8.0
cuDNN 5.1
Python 2.7.12


cysmith commented on June 14, 2024

I'm not getting this error so I'm kind of shooting in the dark.

Try changing line 361 from F = tf.reshape(x[0], (area, depth)) to F = tf.reshape(x, (area, depth))

Please let me know by email ([email protected]) if that works as soon as you can.


krystophv commented on June 14, 2024

Well, after dropping the comment I decided to go ahead and try compiling TensorFlow at the latest stable release (0.10.0), and that seems to have solved the issue (potentially). It now gets as far as the L-BFGS code, whereas previously it errored out well before that. It hasn't successfully finished (and won't for a little while; the 750M isn't the fastest GPU on the block), but I'm hopeful.


krystophv commented on June 14, 2024

Since that's all I changed and it seems to work, that's the inference I would make. I don't really have the knowledge to say with confidence that something was fixed between 0.9.0 and 0.10.0, but it seems to be working for me.


cysmith commented on June 14, 2024

Ok. I pushed the change of line 361 from F = tf.reshape(x[0], (area, depth)) to F = tf.reshape(x, (area, depth)). Both work on my machine but if anyone continues to encounter this problem please let me know.
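
For anyone wondering why both versions can give the same result, here is a rough NumPy sketch (not the repo's TensorFlow code). It assumes the activations `x` have shape (1, height, width, depth), that `area = height * width`, and that the Gram matrix is computed as F^T F after the reshape; the function and variable names are made up for illustration.

```python
import numpy as np


def gram_matrix_sketch(x, area, depth):
    # Flatten the spatial dimensions so each row is one pixel's feature vector.
    F = np.reshape(x, (area, depth))
    # Channel-by-channel correlations: (depth, area) x (area, depth).
    return np.dot(F.T, F)


height, width, depth = 4, 5, 3
area = height * width
x = np.random.rand(1, height, width, depth)  # batch of one (assumed layout)

G_from_batch = gram_matrix_sketch(x, area, depth)     # reshape(x, ...)
G_from_first = gram_matrix_sketch(x[0], area, depth)  # reshape(x[0], ...)

print(G_from_batch.shape)
print(np.allclose(G_from_batch, G_from_first))
```

With a batch of one, `x` and `x[0]` hold the same `area * depth` values in the same order, so the reshape yields the same matrix either way. The difference is that the new version no longer indexes into the tensor before reshaping.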


xinghedyc commented on June 14, 2024

@cysmith thanks a lot, it works for me.
Environment:
Ubuntu 16.04
CUDA 8.0rc
TensorFlow 0.10
cuDNN 5
Python 2.7
