Comments (16)

TomArrow commented on August 22, 2024

I wanted to try this progressive upscaling, but I see no development branch, only the gram-thing. Is it still possible to get this?

fzliu commented on August 22, 2024

Thanks for your interest in the code! I've actually recently been looking into optimizing the pipeline as much as possible, and L-BFGS was one of the first things I looked at. I did some benchmarks using a max length of 500 for input images - it turns out that around 90% of the time is spent in the forward-backward steps for VGG. You'll probably be able to squeeze out a bit of extra performance by moving the loss minimization step to the GPU, but my guess is that it'll be negligible unless you're working with large images.
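
A quick pycaffe timing loop along these lines is enough to reproduce that kind of measurement (a rough sketch - the model paths are placeholders, not this repo's layout):

import time
import numpy as np
import caffe

# placeholder paths: point these at your local VGG deploy prototxt and weights
net = caffe.Net("models/vgg19/deploy.prototxt",
                "models/vgg19/vgg19.caffemodel",
                caffe.TEST)

# fill the input blob with random data and time repeated forward-backward passes
net.blobs["data"].data[...] = np.random.rand(*net.blobs["data"].data.shape)
start = time.time()
for _ in range(10):
    net.forward()
    net.blobs[net.outputs[0]].diff[...] = 1.0  # seed the top gradient
    net.backward()
print("avg forward-backward: %.2fs" % ((time.time() - start) / 10))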

With that being said, I'm about to push some code to the develop branch, which starts with a small image and progressively scales it up, using the output from the smaller passes to initialize the next minimization pass. The total runtime here is around 10 minutes on my CPU, and the results are comparable to a standard 500-iteration pass (which takes over 2 hours).
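
In pseudocode, the idea is roughly the following (a minimal sketch - run_pass is a hypothetical stand-in for a single L-BFGS minimization at a fixed resolution, and the real code chooses its scales differently):

import numpy as np
from skimage.transform import resize

def progressive_style_transfer(content, style, run_pass, scales=(0.25, 0.5, 1.0)):
    # run_pass(content, style, init) is assumed to run one full minimization
    # at the given resolution and return the stylized image
    h, w = content.shape[:2]
    result = None
    for s in scales:
        shape = (int(h * s), int(w * s))
        c = resize(content, shape)
        st = resize(style, shape)
        # the first pass starts from noise; each later pass starts from the
        # upsampled output of the previous (smaller) pass
        init = np.random.rand(*c.shape) if result is None else resize(result, shape)
        result = run_pass(c, st, init)
    return result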

I haven't tried it on GPU because my poor 750M doesn't have enough memory to support the VGG model. If you have a K40/K80 or Titan X/Z, feel free to drop some performance numbers here - I'm curious to see how fast it will run on a state-of-the-art GPU.

commented on August 22, 2024

Yeah totally, I have a Titan X. Update the develop branch and I'll run some tests tomorrow. Also, I just updated my master branch and noticed the "-g 0" flag isn't working anymore, even though the older version of the code runs fine on the GPU. That's strange - just thought I'd give you a heads up. I'll try to debug that tomorrow too.

fzliu commented on August 22, 2024

Just pushed to develop - results might require some tuning but they should be decent for now. And yup, there are unfortunately some issues with the master branch right now due to a merge from a couple of days ago, but I'm holding off on any further changes to master until the runtime gets into a more reasonable range.

commented on August 22, 2024

I ran some tests.

At 1920 x 1080 pixels, using around 9 GB of GPU memory:

style.py::09:43:53.369 -- Starting style transfer.
style.py::09:43:54.560 -- Running net on GPU 0.
style.py::09:43:55.576 -- Successfully loaded images.
style.py::09:44:00.524 -- Successfully loaded model vgg.
style.py::09:44:00.524 -- Minimization pass 1 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:07:45
style.py::09:51:46.582 -- Ran 513 iterations in 466s.
style.py::09:51:46.587 -- Minimization pass 2 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:07:14
style.py::09:59:03.202 -- Ran 129 iterations in 437s.
style.py::09:59:03.222 -- Minimization pass 3 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:08:13
style.py::10:07:24.604 -- Ran 33 iterations in 501s.
/usr/local/lib/python2.7/dist-packages/skimage/util/dtype.py:111: UserWarning: Possible precision loss when converting from float32 to uint8
"%s to %s" % (dtypeobj_in, dtypeobj))
23 minutes and 45 seconds elapsed.

At 524 pixels:

style.py::10:26:58.633 -- Starting style transfer.
style.py::10:26:58.805 -- Running net on GPU 0.
style.py::10:26:59.710 -- Successfully loaded images.
style.py::10:27:00.893 -- Successfully loaded model vgg.
style.py::10:27:00.893 -- Minimization pass 1 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:01:56
style.py::10:28:57.870 -- Ran 513 iterations in 117s.
style.py::10:28:57.871 -- Minimization pass 2 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:00:49
style.py::10:29:47.980 -- Ran 129 iterations in 50s.
style.py::10:29:47.982 -- Minimization pass 3 of 3.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:00:39
style.py::10:30:28.202 -- Ran 33 iterations in 40s.
/usr/local/lib/python2.7/dist-packages/skimage/util/dtype.py:111: UserWarning: Possible precision loss when converting from float32 to uint8
"%s to %s" % (dtypeobj_in, dtypeobj))
3 minutes and 30 seconds elapsed.


Before the change, a 524px image took 10 minutes; now it's down to 3:30. A 1920px image took 2 or so hours; now it's down to 23 minutes. It's an interesting trick you did. The results are close, but some areas are messed up. Are the Torch implementations doing the same thing? And what exactly are you doing - running a number of iterations at a smaller scale, then scaling up and running another pass?

fzliu commented on August 22, 2024

Yeah - I do an initial set of updates at a smaller scale, then scale the image up before running more iterations. I'd expect the results for large images to be a bit messed up without some tuning, though.

I'm surprised that it still takes so long on the GPU; I expected runtimes of under a minute for length ~500 images. My guess is that this is due to the synchronization between the CPU and GPU for each of the 6 VGG layers being used. I knew that there would be overhead here, but I didn't realize that it would be that much. A Torch-based implementation wouldn't need this synchronization step, since you can add custom loss layers on the fly and just do the entire forward-backward pass on the GPU. I'm sure this could be done in Caffe as well, but it would require injecting some custom C++ code for the style loss and building separate "datasets" for each pair of input content and style images, among other things. With that being said, the only two Torch implementations I'm familiar with are Kai Sheng's and Justin's. I don't think either of those implementations is doing what I'm doing here, but I'd expect them to run faster with this progressive upscaling.
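
To make the sync point concrete, the callback that scipy's L-BFGS-B calls looks roughly like this (a stripped-down sketch with only a content term - the layer name and target are placeholders, and the real loss adds the Gram-matrix style terms):

import numpy as np
from scipy.optimize import minimize

def loss_and_grad(x, net, layer, target):
    # push the current image estimate through the net
    net.blobs["data"].data[...] = x.reshape(net.blobs["data"].data.shape)
    net.forward()

    act = net.blobs[layer].data.copy()   # reading a blob copies GPU -> CPU
    diff = act - target
    loss = 0.5 * np.sum(diff ** 2)

    net.blobs[layer].diff[...] = diff    # writing the diff copies CPU -> GPU
    net.backward(start=layer)
    grad = net.blobs["data"].diff.ravel().astype(np.float64)
    return loss, grad

# scipy then runs the actual minimization on the CPU, e.g.:
# res = minimize(loss_and_grad, x0, args=(net, "conv4_2", target),
#                method="L-BFGS-B", jac=True, options={"maxiter": 500})

Every iteration crosses the CPU/GPU boundary at each loss layer, which is where the overhead comes from.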

Could you post some example outputs? For reference, I just tried one of Justin's images and got these results for progressive upscaling:

[image: golden_gate-rain_princess-vgg-content-1e5-512]
[image: golden_gate-starry_night-vgg-content-1e5-512]

versus standard:

[image: rain_princess-golden_gate-vgg-content-1e5-500]
[image: starry_night-golden_gate-vgg-content-1e5-500]

commented on August 22, 2024

Sometimes the results are even better.

This is with no progressive up-scaling:
[image: result]

This is with progressive up-scaling:
[image: test]

Both are with the same settings though. Playing with -r does make a difference. I'll run a big image and post it as soon as I can.

commented on August 22, 2024

Also have you tried the normalized VGG model?

commented on August 22, 2024

Here's an HD image!

Progressive up-scaling @ 21 minutes:
[image: test2]

No progressive up-scaling @ a painful 138 minutes:
[image: result]


I don't have a super fast CPU (Intel® Core™ i7-3820 @ 3.60GHz), so the sync is probably slowing me down. The progressive up-scaling is a great idea though, very clever. Whenever you need GPU benchmarks, specify the test and I'll run it for you. I also think you should add a form of TV loss like in Justin's or Kai's implementations - it would greatly improve image quality.
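
Something along these lines is what I mean by TV loss (a rough numpy version of the penalty and its gradient, not code from either repo):

import numpy as np

def tv_loss_and_grad(img, weight=1e-3):
    # img: H x W x C float array; penalizes differences between neighboring pixels
    dx = img[:, 1:, :] - img[:, :-1, :]
    dy = img[1:, :, :] - img[:-1, :, :]
    loss = weight * (np.sum(dx ** 2) + np.sum(dy ** 2))

    grad = np.zeros_like(img)
    grad[:, 1:, :] += 2 * weight * dx
    grad[:, :-1, :] -= 2 * weight * dx
    grad[1:, :, :] += 2 * weight * dy
    grad[:-1, :, :] -= 2 * weight * dy
    return loss, grad

Adding that loss and gradient to whatever the minimizer already sees should be enough to smooth out the noisy regions.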

fzliu commented on August 22, 2024

Awesome results - thanks for generating them! TV denoising (or some other type of smoothing) is on my TODO list. I haven't tried the normalized model, but I think the standard VGG models should be able to generate better results.

As always, feel free to suggest further improvements and/or optimizations.

commented on August 22, 2024

Hi again. I just ran your old develop branch code on a Fedora machine. Do you know of any reason why it wouldn't multi-thread? It's slow on Fedora.

fzliu commented on August 22, 2024

Sorry for the late reply (been really busy at work lately). I don't think the code is multi-threaded by default, unless you have MKL installed. You're running it on the CPU, correct?

dpaiton commented on August 22, 2024

Just an update: I tested the GPU flag and it seems to be working fine on my end - not sure if you're still having this issue. Also, as far as the normalized VGG model goes, I have tried it, but I wasn't able to get it to work as well as I wanted; rescaling the ratio parameter doesn't seem to have an effect. It might be an issue with my model.

commented on August 22, 2024

I had messed up my environment variables. Obviously, the code is multi-threaded in CPU mode if you set the environment variables up correctly for MKL or OpenBLAS. Weirdly enough, in GPU mode (-g 0) it is also multi-threaded for me, but only with OpenBLAS, and only on Ubuntu. I don't know why.
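
For reference, the usual knobs are something like this - which one matters depends on whether Caffe was built against OpenBLAS or MKL, they have to be set before the BLAS library gets loaded, and the count of 8 is just an example:

import os
# set these before numpy/caffe are imported, otherwise the BLAS library
# may have already picked its thread count
os.environ["OPENBLAS_NUM_THREADS"] = "8"  # OpenBLAS builds
os.environ["OMP_NUM_THREADS"] = "8"       # OpenMP-threaded builds
os.environ["MKL_NUM_THREADS"] = "8"       # MKL builds
import numpy as np
import caffe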

fzliu commented on August 22, 2024

Hmm, that's strange, but it sounds more related to Caffe than anything else. What Caffe commit hash are you using?

commented on August 22, 2024

I don't know the hash - I didn't download it through git. All I know is that it's from Aug 27-28, 2015. I can send you a zip if you want.
