
Comments (16)

htoyryla commented on September 7, 2024

I repeated the training with content_weight=1, style_weight=3, up to 40k iterations. The full command was:

th train.lua -h5_file /work/hplaces256.h5 -style_image /home/hannu/Downloads/udnie.jpg -checkpoint_name udnie-hplaces256b -style_weights 3.0 -content_weights 1.0 -gpu 0 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3

Here's the resulting image. I then wrote a script to copy the original colors onto an image (https://gist.github.com/htoyryla/147f641f2203ad01b040f4b568e98260) and used it to make the second image.

I think it would be possible to get even closer to the Prisma look by further fine-tuning the weights, and to get a touch of detail by blending the different channels of the original image into the result in a suitable mix, instead of simply copying the color information.
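For reference, the core of the color-copying idea is roughly this (a simplified sketch, not the exact gist, assuming the torch 'image' package; file names are placeholders):

-- keep the stylized luminance (Y) but take the chroma (U, V) from the original photo
require 'image'

local content  = image.load('content.jpg', 3, 'float')    -- original photo
local stylized = image.load('stylized.png', 3, 'float')   -- network output

-- make the sizes match before mixing channels
content = image.scale(content, stylized:size(3), stylized:size(2))

local content_yuv  = image.rgb2yuv(content)
local stylized_yuv = image.rgb2yuv(stylized)

local out_yuv = stylized_yuv:clone()
out_yuv[2]:copy(content_yuv[2])   -- U from the original
out_yuv[3]:copy(content_yuv[3])   -- V from the original

image.save('stylized_original_colors.png', image.yuv2rgb(out_yuv))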

Output from fast_neural_style.lua:
[image: undie_ny-test001z]

Output from original_colors.lua:
[image: t001z]


htoyryla commented on September 7, 2024

I guess Prisma is copying color from the original image here, which could explain some of the differences.

Somehow I feel that while both results are interesting, neither really captures much of the original style. Fast-neural-style fills the picture with a colored mesh that captures the forms in the content image, using colors from the style image, but it doesn't really resemble the shapes, their scale, or the feel of the original style.

I may be mistaken, but I think the iterative neural-style was better at real style transfer. Fast-neural-style is a great tool for creating styles, but these styles tend to look very different from the originals. I have had similar experiences with texture_nets, with which I experimented for days trying to get the style reproduced at more or less the original scale, until I gave up and moved to something else. I have not yet tried the same with fast-neural-style.

By the way, it looks to me like the new style transfer methods can easily fill the canvas with decorative stylistic details, but the opposite, simplifying, is difficult. And yet, much of art is about simplifying what you see and capturing it in an image. Prisma, in this example, is I think closer, but not exactly what I am after.

PS. I think that the fast_neural_style result uses quite a lot of style weight. I am training right now with content weight 1 and style weight 5, and the result looks much more like Prisma's, without the mesh in the sky, but simpler, without detail. Actually I am quite pleased with the result.

I am not using the MSCOCO dataset but a set of 2500 of my own photos, mainly places and landscapes. The dataset seems to matter; it may be worthwhile to try a dataset with the kind of images one intends to use with the style.
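For anyone wanting to try the same: if I remember the repo correctly, the h5 file for a custom dataset is built with the bundled script, along these lines (the paths and output name here are placeholders; check the script for the exact flags):

python scripts/make_style_dataset.py --train_dir /path/to/my/photos --val_dir /path/to/val/photos --output_file myplaces256.h5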

This is after 8000 iterations, so still quite early. What I wrote above was based on even earlier iterations. I wonder if the mesh in the sky grows with the iterations. The earlier snapshots were simpler, with a clear sky and some clouds. Now there are already signs of the colored mesh.
[image: undie_ny-test000]


htoyryla commented on September 7, 2024

Here's the result from my newly trained model after 40k iterations. The "colored mesh" did not spread throughout the sky as I feared, but in fact retreated, and the overall look improved. Still, it has almost no similarity to the original style.

In this image I'd point out the white "ghosts" behind the bridge and the buildings on the left. Here they blend quite well into the background but in my experiments I've seen images almost totally dominated by such "ghost" shapes, especially in the sky, and especially with higher style weights.

[image: undie_ny-test000h]


jcjohnson commented on September 7, 2024

Wow, nice work @htoyryla! I think your Udnie model is better than mine :)

In general I don't think that Prisma is doing anything fundamentally different from fast-neural-style; I think they have just spent a lot of time and effort carefully tuning the hyperparameters of their models to give nice effects. I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I think they also tune the post-processing per style to make sure their results are nice.


htoyryla commented on September 7, 2024

"I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I think they also tune the post-processing per style to make sure their results are nice."

That's exactly what I was thinking when I wrote about blending. I simply copied the Y channel, but if one wants a touch of detail, I think one could blend some of the other channels as well. The optimal way to do this is likely to be style-specific.
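Something along these lines, for example, as a variant of the sketch above (the mix values here are made up and would need tuning per style):

-- blend some of the original image back in, channel by channel in YUV space
require 'image'

local content  = image.load('content.jpg', 3, 'float')
local stylized = image.load('stylized.png', 3, 'float')
content = image.scale(content, stylized:size(3), stylized:size(2))

local c_yuv = image.rgb2yuv(content)
local s_yuv = image.rgb2yuv(stylized)

local mix = {0.2, 0.9, 0.9}   -- fraction of the original Y, U, V to keep (guesses)
for ch = 1, 3 do
  s_yuv[ch]:mul(1 - mix[ch]):add(mix[ch], c_yuv[ch])
end

image.save('blended.png', image.yuv2rgb(s_yuv))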

I used my own dataset consisting of 2500 photos, places and landscapes. I have noticed that using it can give quite different results from MSCOCO. I'll now run the same training using MSCOCO for comparison.


jcjohnson commented on September 7, 2024

"I used my own dataset consisting of 2500 photos, places and landscapes. I have noticed that using it can give quite different results from MSCOCO. I'll now run the same training using MSCOCO for comparison."

Interesting; I have only tried training with COCO, but I'm pretty sure the training images are important. I think the number of training images is also important; in Dmitry's Instance Normalization paper (https://arxiv.org/abs/1607.08022) he mentions that his best results were trained with only 16 content images. I haven't done much experimentation with training sets, but this seems to be an important area for exploration.


htoyryla commented on September 7, 2024

I have now trained using COCO but otherwise the same parameters. The result is different, but not too different; it looks almost as if I had used a slightly higher style weight.

From fast_neural_style.lua:

[image: undie_ny-mscoco001z]

After original_colors.lua:

[image: t002z]


xlvector commented on September 7, 2024

The results look awesome! Thanks!

I will try to do more work on post-processing.


xlvector commented on September 7, 2024

@jcjohnson I have tried to train with 200K images from COCO, MIT Places, and ImageNet. It does not seem to give better results. I will try this again later.


xlvector commented on September 7, 2024

@jcjohnson what are your parameters for the the_wave style? It seems I cannot reproduce your results with the parameters from print_options.lua.

Following is my result:

[image: free_coco_wave7]

Your result:

[image: th_free_wave]

Prisma:

[image: free_wave]


jcjohnson commented on September 7, 2024

My wave model does not use instance norm, so you should set -use_instance_norm 0 if you want to duplicate my results.
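For example, something like this (paths are placeholders; the other options should match what print_options.lua reports for the checkpoint):

th train.lua -h5_file /path/to/coco.h5 -style_image /path/to/the_wave.jpg -checkpoint_name wave -use_instance_norm 0 -gpu 0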


piteight commented on September 7, 2024

Is there a way to change parameters manually inside a model? Something like selecting the weights that interest us and changing their values? This script uses the VGG16 model, but I also have a VGG19 model (which I have not trained) that gave me better results in the slow neural-style from fzliu. Can I switch to training with the VGG19 model using your script?
Here are my results with VGG16 :)
Trained image:
[image: stasio]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 256 -content_weights 1.0 -style_weights 5.0 -checkpoint_name stasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100

[image: ay]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 3.0 -style_weights 1.0 -checkpoint_name ztasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3
[image: az]
th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 3.0 -style_weights 5.0 -checkpoint_name Xtasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3 -max_train 1
[image: azz]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 0.5 -style_weights 8.0 -checkpoint_name Ytasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3
[image: azzz]

The first model is the best, but it has some glitches that I want to erase in future training sessions.
Everything was trained with 40k COCO images.


htoyryla commented on September 7, 2024

I see you have copied the arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3 from my comment. As @jcjohnson commented in another thread, that might be a poor choice, and perhaps one should add a conv layer between the two U2 layers (even though it worked for me without one).
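For illustration only (I have not tested this exact string, so treat it as a guess), the extra conv layer would go something like: c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,c3s1-64,U2,c9s1-3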


piteight commented on September 7, 2024

Yes, I thought I would give it a try, to see methods other than just changing the style and content weights. The second version is quite good because of the sky, except for this noise pattern. It worked well for chicago.jpg:
[image: chicagoz]
but in the example with a person, the result was very poor:
[image: ztkaw]

The first example gave me this output:
[image: out]
[image: tkaw]

I will try adding the conv layer as you mentioned :)


universewill commented on September 7, 2024

@htoyryla can you make your dataset and pretrained models' parameters available?


htoyryla commented on September 7, 2024

"@htoyryla can you make your dataset and pretrained models' parameters available?"

After almost three years of doing other things, no, sorry.

