Comments (6)

titu1994 commented on July 28, 2024

@kronion Is your Keras image_dim_ordering set to "tf"? Check the ~/.keras/keras.json file.

Also, this was on CPU? How much time did it actually take? I assume it took an enormous amount of time.

Edit:
This looks like a case of using the TensorFlow backend with image_dim_ordering set to "th". Because of this, the Theano weights are being loaded with the TensorFlow backend (Theano convolutional kernels need to be flipped before being used in TensorFlow; if they aren't flipped, this is usually the result).
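
For reference, a minimal numpy sketch of what that flip looks like, assuming a Theano-format kernel of shape (nb_filter, channels, rows, cols); the actual Keras conversion utilities may additionally transpose to TensorFlow's (rows, cols, channels, nb_filter) layout:

    import numpy as np

    def flip_theano_kernel(kernel):
        # Theano's conv2d performs true convolution while TensorFlow's performs
        # cross-correlation, so Theano-trained kernels must have their spatial
        # dimensions rotated 180 degrees before use with the TF backend.
        return kernel[:, :, ::-1, ::-1]

    # Example: dummy 3x3 kernels, 64 filters over 3 input channels.
    k_th = np.random.randn(64, 3, 3, 3).astype("float32")
    k_tf_ready = flip_theano_kernel(k_th)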


kronion commented on July 28, 2024
{
    "floatx": "float32",
    "backend": "tensorflow",
    "epsilon": 1e-07,
    "image_dim_ordering": "tf"
}

on both machines.
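
If it helps, a quick way to double-check what Keras actually picks up at runtime (a sketch, assuming the Keras 1.x API):

    from keras import backend as K

    # Should print "tensorflow" and "tf" if keras.json is being read as expected.
    print(K.backend())
    print(K.image_dim_ordering())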

And yes, it did take an enormous amount of time. I was able to do 25 epochs in about 6 hours, and I cut it off there.


kronion commented on July 28, 2024

Here are the logs from running ten epochs on my GPU. I thought the memory warnings could have something to do with the problems I'm experiencing, which is why I waited to reproduce them on a CPU before posting.

$ python Network.py images/inputs/content/sagano_bamboo_forest.jpg images/inputs/style/patterned_leaves.jpg images/output/test4

Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 670
major: 3 minor: 0 memoryClockRate (GHz) 0.98
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.78GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 670, pci bus id: 0000:01:00.0)
Model loaded.
Start of iteration 1
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.11GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.11GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 2.15GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 2.15GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 2.15GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 2.15GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 1.09GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
Network.py:352: RuntimeWarning: invalid value encountered in double_scalars
  improvement = (prev_min_val - min_val) / prev_min_val * 100
Current loss value: 1.00896e+08  Improvement : nan %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_1.png
Iteration 1 completed in 23s
Start of iteration 2
Current loss value: 3.24734e+07  Improvement : 67.815 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_2.png
Iteration 2 completed in 21s
Start of iteration 3
Current loss value: 1.9547e+07  Improvement : 39.806 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_3.png
Iteration 3 completed in 20s
Start of iteration 4
Current loss value: 1.44546e+07  Improvement : 26.052 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_4.png
Iteration 4 completed in 21s
Start of iteration 5
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 21978 get requests, put_count=21977 evicted_count=1000 eviction_rate=0.0455021 and unsatisfied allocation rate=0.0500956
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
Current loss value: 1.18752e+07  Improvement : 17.845 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_5.png
Iteration 5 completed in 20s
Start of iteration 6
Current loss value: 9.70512e+06  Improvement : 18.274 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_6.png
Iteration 6 completed in 21s
Start of iteration 7
Current loss value: 8.33909e+06  Improvement : 14.075 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_7.png
Iteration 7 completed in 20s
Start of iteration 8
Current loss value: 7.34698e+06  Improvement : 11.897 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_8.png
Iteration 8 completed in 21s
Start of iteration 9
Current loss value: 6.64953e+06  Improvement : 9.493 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_9.png
Iteration 9 completed in 21s
Start of iteration 10
Current loss value: 5.85107e+06  Improvement : 12.008 %
Rescaling Image to (1080, 1920)
Image saved as images/output/test4_at_iteration_10.png
Iteration 10 completed in 21s


titu1994 commented on July 28, 2024

@kronion Hmm, that is weird. The code does handle all the TensorFlow differences properly (Network.py is the same as the Keras example, just with the variables exposed as arguments). See https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py

Since I am on Windows, I can't use the TensorFlow backend to check. The original script has been tested on both backends, so I assume Network.py should produce the exact same results. It's loading the same weights and the same models as well, so I don't understand what's going wrong.
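
For example, the gram_matrix helper in that example branches on the dim ordering before flattening the features (paraphrased from the Keras 1.x example; exact code may differ between versions):

    from keras import backend as K

    def gram_matrix(x):
        # x is a 3D feature tensor from one VGG layer.
        assert K.ndim(x) == 3
        if K.image_dim_ordering() == 'th':
            # 'th': channels first, so batch_flatten already yields (channels, rows*cols).
            features = K.batch_flatten(x)
        else:
            # 'tf': channels last, so move channels to the front before flattening.
            features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
        return K.dot(features, K.transpose(features))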

Can I bother you to run the original script and see if the results are still wrong?


kronion commented on July 28, 2024

Weird, using Theano worked. So maybe TF support is buggy in the original implementation? Perhaps this line is the hint; I don't see it when I run with Theano:

Network.py:352: RuntimeWarning: invalid value encountered in double_scalars

That, or I'm not installing TensorFlow correctly...


titu1994 commented on July 28, 2024

@kronion Have you tested it using the original script? That error comes after the image has already been created and the loss value has been returned. It's thrown during the calculation of "improvement", and it doesn't appear when using Theano.
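
A guard around that division would at least avoid the spurious "nan %" on the first iteration. A sketch, assuming prev_min_val starts out non-finite (e.g. inf) before any loss has been recorded; the names mirror the warning line above, not the actual Network.py code:

    import numpy as np

    def improvement_pct(prev_min_val, min_val):
        # First iteration: no finite previous loss yet. With prev_min_val = inf
        # the expression becomes inf/inf = nan, which is what triggers the
        # "invalid value encountered in double_scalars" RuntimeWarning.
        if not np.isfinite(prev_min_val) or prev_min_val == 0:
            return None
        return (prev_min_val - min_val) / prev_min_val * 100.0

    print(improvement_pct(np.inf, 1.00896e+08))        # None -> skip the printout
    print(improvement_pct(1.00896e+08, 3.24734e+07))   # ~67.815, as in the log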

