
Comments (16)

htoyryla commented on September 7, 2024

I repeated the training with content_weight=1, style_weight=3, up to 40k iterations. The full command was:

th train.lua -h5_file /work/hplaces256.h5 -style_image /home/hannu/Downloads/udnie.jpg -checkpoint_name udnie-hplaces256b -style_weights 3.0 -content_weights 1.0 -gpu 0 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3

Here's the resulting image. I then wrote a script to copy the original colors onto an image (https://gist.github.com/htoyryla/147f641f2203ad01b040f4b568e98260) and used it to make the second image.

I think it would be possible to get even closer to the Prisma look by further fine-tuning the weights, and to get a touch of detail by blending the different channels of the original image into the result in a suitable mix, instead of simply copying the color information.
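For reference, the core of the color-copying idea is roughly this (a simplified sketch, not the exact gist, assuming the torch 'image' package; file names are placeholders):

-- keep the stylized luminance (Y) but take the chroma (U, V) from the original photo
require 'image'

local content  = image.load('content.jpg', 3, 'float')    -- original photo
local stylized = image.load('stylized.png', 3, 'float')   -- network output

-- make the sizes match before mixing channels
content = image.scale(content, stylized:size(3), stylized:size(2))

local content_yuv  = image.rgb2yuv(content)
local stylized_yuv = image.rgb2yuv(stylized)

local out_yuv = stylized_yuv:clone()
out_yuv[2]:copy(content_yuv[2])   -- U from the original
out_yuv[3]:copy(content_yuv[3])   -- V from the original

image.save('stylized_original_colors.png', image.yuv2rgb(out_yuv))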

Output from fast_neural_style.lua:
[image: undie_ny-test001z]

Output from original_colors.lua:
[image: t001z]


htoyryla commented on September 7, 2024

I guess Prisma is copying color from the original image here, which could explain some of the differences.

Somehow I feel that while both results are interesting, neither really captures much of the original style. Fast-neural-style fills the picture with a colored mesh that captures the forms in the content image, using colors from the style image, but it doesn't really resemble the shapes, their scale, or the feel of the original style.

I may be mistaken, but I think the iterative neural-style was better at real style transfer. Fast-neural-style is a great tool for creating styles, but these styles tend to look very different from the originals. I have had similar experiences with texture_nets, with which I experimented for days trying to get the style reproduced at more or less the original scale, until I gave up and moved to something else. I have not yet tried the same with fast-neural-style.

By the way, it looks to me like the new style transfer methods can easily fill the canvas with decorative stylistic details, but the opposite, simplifying, is difficult. And yet, much of art is about simplifying what you see and capturing it in an image. Prisma, in this example, is I think closer, but not exactly what I am after.

PS. I think that the fast_neural_style result uses quite a lot of style weight. I am training right now with content weight 1 and style weight 5, and the result looks much more like Prisma's, without the mesh in the sky, but simpler, without detail. Actually I am quite pleased with the result.

I am not using the MSCOCO dataset but a set of 2500 of my own photos, mainly places and landscapes. The dataset seems to matter; it may be worthwhile to try a dataset with the kind of images one intends to use with the style.
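For anyone wanting to try the same: if I remember the repo correctly, the h5 file for a custom dataset is built with the bundled script, along these lines (the paths and output name here are placeholders; check the script for the exact flags):

python scripts/make_style_dataset.py --train_dir /path/to/my/photos --val_dir /path/to/val/photos --output_file myplaces256.h5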

This is after 8000 iterations, so still quite early. What I wrote above was based on even earlier iterations. I wonder if the mesh in the sky grows with the iterations. The earlier snapshots were simpler, with a clear sky and some clouds. Now there are already signs of the colored mesh.
[image: undie_ny-test000]


htoyryla commented on September 7, 2024

Here's the result from my newly trained model after 40k iterations. The "colored mesh" did not spread throughout the sky as I feared, but in fact retreated, and the overall look improved. Still, it has almost no similarity to the original style.

In this image I'd point out the white "ghosts" behind the bridge and the buildings on the left. Here they blend quite well into the background but in my experiments I've seen images almost totally dominated by such "ghost" shapes, especially in the sky, and especially with higher style weights.

[image: undie_ny-test000h]


jcjohnson commented on September 7, 2024

Wow, nice work @htoyryla! I think your Udnie model is better than mine :)

In general I don't think that Prisma is doing anything fundamentally different from fast-neural-style; I think they have just spent a lot of time and effort carefully tuning the hyperparameters of their models to give nice effects. I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I think they also tune the post-processing per style to make sure their results are nice.


htoyryla commented on September 7, 2024

"I think they also do some post-processing to blend the raw neural-net output with the content image; there are a lot of different ways to blend images, and I think they also tune the post-processing per style to make sure their results are nice."

That's exactly what I was thinking when I wrote about blending. I simply copied the Y channel, but if one wants a touch of detail, I think one could blend some of the other channels as well. The optimal way to do this is likely to be style-specific.
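Something along these lines, for example, as a variant of the sketch above (the mix values here are made up and would need tuning per style):

-- blend some of the original image back in, channel by channel in YUV space
require 'image'

local content  = image.load('content.jpg', 3, 'float')
local stylized = image.load('stylized.png', 3, 'float')
content = image.scale(content, stylized:size(3), stylized:size(2))

local c_yuv = image.rgb2yuv(content)
local s_yuv = image.rgb2yuv(stylized)

local mix = {0.2, 0.9, 0.9}   -- fraction of the original Y, U, V to keep (guesses)
for ch = 1, 3 do
  s_yuv[ch]:mul(1 - mix[ch]):add(mix[ch], c_yuv[ch])
end

image.save('blended.png', image.yuv2rgb(s_yuv))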

I used my own dataset consisting of 2500 photos, places and landscapes. I have noticed that using it can give quite different results from MSCOCO. I'll now run the same training using MSCOCO for comparison.


jcjohnson commented on September 7, 2024

"I used my own dataset consisting of 2500 photos, places and landscapes. I have noticed that using it can give quite different results from MSCOCO. I'll now run the same training using MSCOCO for comparison."

Interesting; I have only tried training with COCO, but I'm pretty sure the training images are important. I think the number of training images is also important; in Dmitry's Instance Normalization paper (https://arxiv.org/abs/1607.08022) he mentions that his best results were trained with only 16 content images. I haven't done much experimentation with training sets, but this seems to be an important area for exploration.


htoyryla commented on September 7, 2024

I have now trained using COCO but otherwise the same parameters. The result is different, but not too different; it looks almost as if I had used a slightly higher style weight.

From fast_neural_style.lua:

[image: undie_ny-mscoco001z]

After original_colors.lua:

[image: t002z]


xlvector commented on September 7, 2024

The results look awesome! Thanks!

I will try to do more work on post-processing.


xlvector commented on September 7, 2024

@jcjohnson I have tried to train with 200K images from COCO, MIT Places, and ImageNet. It does not seem to give better results. I will try this again later.


xlvector commented on September 7, 2024

@jcjohnson what are your parameters for the the_wave style? It seems I cannot reproduce your results with the parameters from print_options.lua.

Following is my result:

[image: free_coco_wave7]

Your result:

[image: th_free_wave]

Prisma:

[image: free_wave]


jcjohnson commented on September 7, 2024

My wave model does not use instance norm, so you should set -use_instance_norm 0 if you want to duplicate my results.
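For example, something like this (paths are placeholders; the other options should match what print_options.lua reports for the checkpoint):

th train.lua -h5_file /path/to/coco.h5 -style_image /path/to/the_wave.jpg -checkpoint_name wave -use_instance_norm 0 -gpu 0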


piteight commented on September 7, 2024

Is there a way to change parameters manually inside a model? Something like selecting the weights that interest us and changing their values? This script uses the VGG16 model, but I also have a VGG19 model (which I have not trained) that gave me better results in the slow neural-style from fzliu. Can I switch to training with the VGG19 model using your script?
Here are my results with VGG16 :)
Trained image:
[image: stasio]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 256 -content_weights 1.0 -style_weights 5.0 -checkpoint_name stasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100

[image: ay]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 3.0 -style_weights 1.0 -checkpoint_name ztasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3
[image: az]
th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 3.0 -style_weights 5.0 -checkpoint_name Xtasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3 -max_train 1
[image: azz]

th train.lua -h5_file ../Baz.h5 -style_image ../stasio.jpg -style_image_size 300 -content_weights 0.5 -style_weights 8.0 -checkpoint_name Ytasio -gpu 0 -use_cudnn 1 -backend cuda -batch_size 2 -checkpoint_every 100 -arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3
[image: azzz]

The first model is the best, but it has some glitches that I want to erase in future training sessions.
Everything was trained with 40k COCO images.


htoyryla commented on September 7, 2024

I see you have copied the arch c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,U2,c9s1-3 from my comment. As @jcjohnson commented in another thread, that might be a poor choice, and perhaps one should add a conv layer between the two U2 layers (even though it worked for me without one).
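For illustration only (I have not tested this exact string, so treat it as a guess), the extra conv layer would go something like: c9s1-16,d32,d64,R64,R64,R64,R64,R64,U2,c3s1-64,U2,c9s1-3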


piteight commented on September 7, 2024

Yes, I thought I would give it a try, to see methods other than just changing the style and content weights. The second version is quite good because of the sky, except for this noise pattern. It worked well for chicago.jpg:
[image: chicagoz]
but in the example with a person, the result was very poor:
[image: ztkaw]

The first example gave me this output:
[image: out]
[image: tkaw]

I will try adding the conv layer as you mentioned :)


universewill commented on September 7, 2024

@htoyryla can you make your dataset and pretrained models' parameters available?


htoyryla commented on September 7, 2024

"@htoyryla can you make your dataset and pretrained models' parameters available?"

After almost three years of doing other things, no, sorry.

