
Comments (30)

titu1994 commented on July 28, 2024

@bmaltais Hmm this is strange. Have you tried using some other color than gray for that region? I'll try testing this with another mask.

EDIT:
I was able to reproduce this issue. Seems changing the color or shape has no effect. Will try to find some solution to this.
[image: neural doodle]

from neural-style-transfer.

bmaltais commented on July 28, 2024

One thing I just thought of: perhaps the mask PNG files actually contain more than 4 colors, which would explain why some of the masks are ignored, since we specify 4 mask layers. I will need to check the mask files when I get back home tonight. It could be useful to analyse each mask and report whether the number of mask colors differs, or whether the RGB values do not match between the two masks. Perhaps the number of mask layers could even be self-seeded from the mask image; that way it would no longer be needed as a parameter.


titu1994 commented on July 28, 2024

It's an interesting concept for sure, but won't it hurt the startup performance of the script? A naïve implementation would be to create an empty set, add the RGB color tuples into it, and then check whether the set size equals the number of colors.

I may be missing some obvious way of doing this, but checking even a 512x512x3 image will take some time. Can you suggest a quicker way of counting the number of unique RGB tuples in a NumPy array?

EDIT:
This can be avoided by using a tool such as GIMP or Photoshop to create the masks. Since there is a fill selection by color option, it guarantees that there will be a finite set of colors in the mask.
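For reference, the unique-color count can be vectorized with `np.unique` over rows, so even a 512x512 image takes well under a second; this is a sketch, and `count_unique_colors` is a made-up helper name:

```python
import numpy as np

def count_unique_colors(img):
    """Count distinct RGB tuples in an (H, W, 3) image.

    Flattening to (H*W, 3) and taking unique rows avoids a
    Python-level loop over every pixel.
    """
    pixels = img.reshape(-1, img.shape[-1])   # flatten to (H*W, 3)
    return len(np.unique(pixels, axis=0))     # unique rows = unique colors

# A 4-color test mask plus one stray pixel of a 5th color.
mask = np.zeros((64, 64, 3), dtype=np.uint8)  # black background
mask[:, 16:32] = (255, 0, 0)
mask[:, 32:48] = (0, 255, 0)
mask[:, 48:]   = (0, 0, 255)
mask[0, 0]     = (128, 128, 128)              # stray gray pixel

print(count_unique_colors(mask))   # → 5
```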


bmaltais commented on July 28, 2024

I am sorry, I don't. I am not really well versed in Python or neural network programming. Just enough knowledge to be dangerous ;-)

But thinking about it, it would make sense to select the mask colors based on the largest number of pixels. Say there are 5 colors in a mask file, but only 4 are in the majority and the 5th is just a few pixels here and there. Those few should be ignored in favor of the others.

I am not sure how this would be done in code, unfortunately. Right now the easiest approach is probably to use Photoshop or GIMP to clamp the maximum number of colors to 4 and make sure the RGB values of each color match between the two masks.


bmaltais commented on July 28, 2024

Looks like running something like this on each mask file would ensure that only n colors are present:

http://scikit-learn.org/stable/auto_examples/cluster/plot_color_quantization.html

A potential next challenge is to ensure the colors match between the two mask files.


titu1994 commented on July 28, 2024

What you described (5 colors but only 4 major colors) is already handled by the K-Means algorithm. Very similar colors should automatically join the correct clusters.

Checking for an RGB match would incur the same performance penalty as checking an entire image for its number of unique colors. It's a good suggestion though, and I may have to enforce it eventually, but for now using GIMP or PS is recommended.

I believe that the red masked area is somehow not sufficient in describing what needs to be drawn.

I tried merging the red and green colors into one (that area was made green) and setting the number of colors to 3. I got the appropriate results, but still no blue swatches in the area described by the red mask.
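The cluster-merging behavior described above can be sketched with synthetic data; this assumes scikit-learn is available, and all colors and counts here are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Build synthetic mask pixels: 4 dominant colors plus a handful of
# stray pixels whose color is very close to one of the dominant ones.
rng = np.random.default_rng(0)
dominant = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255], [0, 0, 0]], float)
pixels = dominant[rng.integers(0, 4, size=5000)]
strays = np.tile([250.0, 5.0, 5.0], (10, 1))   # near-red stray pixels
pixels = np.vstack([pixels, strays])

# Clustering into 4 regions absorbs the strays into the red cluster.
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(pixels)
red_label = km.predict([[255.0, 0.0, 0.0]])[0]
stray_labels = km.predict(strays)
print((stray_labels == red_label).all())   # → True
```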


bmaltais commented on July 28, 2024

I see. I might try running the masks through a modified version of plot_color_quantization and using the output instead of the raw masks. As a next stage, I will search for code to "match" colors between two pictures.

It might not fix the red mask issue present here, but it would at least automate mask color matching.


titu1994 commented on July 28, 2024

The code is actually quite simple:

set(tuple(val) for row in img for val in row)

Run this on the two different masks, then compare the resulting sets.

I'm worried about running it on large images (512 x 512 x 3 and even higher).


bmaltais commented on July 28, 2024

Wouldn't the speed penalty just be a one-time hit at start time? It shouldn't take minutes... only seconds?


titu1994 commented on July 28, 2024

I'll run a benchmark when I reach home. If it's only a matter of seconds, I'll put it in the script as a warning rather than an exception.


titu1994 commented on July 28, 2024

I just realized something. It doesn't matter how many colors match between the target and style masks. Any similar colors will simply be put into one of the k classes by K-Means. The user only has to tell the script how many regions (cluster centers / major colors) there are; individual colors are not considered. Once clustering occurs, only that number of color centers is taken into account.

This is not the reason the script is failing to properly draw the area masked in red. As for speed, since the images are manipulated as NumPy arrays, I agree that it won't take more than a few seconds even for large images. It just isn't necessary.

As a test to see whether the number of colors matches (not needed, due to K-Means):

from keras.preprocessing.image import load_img, img_to_array

style_mask_path = ""   # put path to style mask
target_mask_path = ""  # put path to target mask

img_nrows = img_ncols = 512

target_mask_img = img_to_array(load_img(target_mask_path, target_size=(img_nrows, img_ncols)))
style_mask_img = img_to_array(load_img(style_mask_path, target_size=(img_nrows, img_ncols)))

print(target_mask_img.shape)  # (img_nrows, img_ncols, 3) with channels_last
print(style_mask_img.shape)

target_set = set()
for i in range(img_nrows):
    for j in range(img_ncols):
        target_set.add(tuple(target_mask_img[i, j]))  # RGB tuple at pixel (i, j)

style_set = set()
for i in range(img_nrows):
    for j in range(img_ncols):
        style_set.add(tuple(style_mask_img[i, j]))

combined_set = style_set & target_set  # colors common to both masks

print(len(target_set), len(style_set), len(combined_set))

This short script will probably show that there are no matches in the combined set (or maybe 1-2 matches). It doesn't matter that the individual colors are not the same between the two masks.


bmaltais commented on July 28, 2024

Looks like some progress has been made on this issue in the Keras discussion. Nice to see. It got me thinking about how the content image could be initialized using the following approach:

For each mask area, find the dominant color of that area in the style image and then "paint" that color onto the corresponding mask area of the content mask. Repeat for each mask color. This way, instead of using noise to initialize the output, the dominant color from the source style area would be used, possibly resulting in a much sharper result and faster convergence. Adding some light random noise over the dominant color might also help the resulting content output.

Another approach could be to take a random part of the masked area in the source style and paste it onto the content mask area to initialize it, then add some blur to soften it and allow for better stylisation.
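The dominant-color initialization idea above could be sketched as follows. This is a hypothetical helper, not the script's actual code: it assumes the masks are available as integer label arrays (one label per region) and uses the region's mean color as a cheap proxy for its dominant color.

```python
import numpy as np

def init_from_region_colors(style_img, style_labels, target_labels, noise_std=0.0):
    """Fill each target region with the mean color of the matching style
    region (mean as a proxy for 'dominant'), plus optional light noise,
    instead of initializing the output from pure noise."""
    out = np.zeros(target_labels.shape + (3,), dtype=float)
    for label in np.unique(target_labels):
        region = style_img[style_labels == label]   # (N, 3) pixels of this region
        out[target_labels == label] = region.mean(axis=0)
    if noise_std > 0:
        out += np.random.normal(0.0, noise_std, out.shape)
    return np.clip(out, 0.0, 255.0)

# Tiny demo: style image with a red left region and a blue right region.
style_img = np.zeros((4, 4, 3), dtype=float)
style_img[:, :2] = (200.0, 0.0, 0.0)
style_img[:, 2:] = (0.0, 0.0, 200.0)
style_labels = np.array([[0, 0, 1, 1]] * 4)
target_labels = np.array([[0, 1], [0, 1]])

init = init_from_region_colors(style_img, style_labels, target_labels)
# Left target pixels come out red, right target pixels blue.
```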


titu1994 commented on July 28, 2024

@bmaltais This issue has now been fixed as per the discussion in keras-team/keras#3731 (comment)

The fix temporarily disables the mask-area weighting that was being applied earlier (see the comment at lines 238-239 in improved_neural_doodle.py).

This mask weighting was causing more stress to be placed on the sky (largest area = highest weight) than on the red region (smallest area = lowest weight). Now all the regions share the same weight and are thus drawn equally.

Result (50 iterations):
[image: starry_night_at_iteration_50]


bmaltais commented on July 28, 2024

Looking much better, but it still produces odd-looking results. I wonder if initializing with a content image would help. I will need to play around a bit.


titu1994 commented on July 28, 2024

@bmaltais I did try using the target mask itself as the initial image. The end result was horrible. Perhaps adding noise to the target image could produce slightly better output? It should be tested, at least.


bmaltais commented on July 28, 2024

@titu1994 I saw that too. In fact I resurrected some past tests I did using https://github.com/alexjc/neural-doodle to compare results. This is what I got. I think neural-doodle's approach of "phases", where the resolution is gradually increased up to the final resolution, actually helps a lot with the visual quality. A single-phase run produces a worse doodle compared to a 3- or 4-phase render.

Here are some comparative pictures: http://imgur.com/a/jKhra This one uses the content image as shown. Overall, the selected style features appear similar but look different compared to neural-doodle. Perhaps they are too small?

Also, it appears to be staying "too true" to the content, even though I set content_weight to 0.


titu1994 commented on July 28, 2024

@bmaltais Certain improvements have been made by dolaameng. See keras-team/keras#3731 (comment)

These changes have been merged into both doodle scripts and I'm running a few example images now. Will update the images in readme.md in some time.


bmaltais commented on July 28, 2024

I will rerun my previous test and see how it compares now. I will update with the result.

OK. After running 15 epochs, the result is clearly not going to be all that much better. In some ways it looks a bit worse (look at the sky): http://imgur.com/a/XTqIa


bmaltais commented on July 28, 2024

Manually applying a "phased" methodology appears to help a lot with the final result, similar to neural-doodle. Perhaps this should be implemented: a "--phase" option, potentially with an improvement-cap option to stop iterating the lower phases once the improvement percentage drops below a specified value.

See the last image of http://imgur.com/a/XTqIa

This provides a much better result than using noise initialization with a single phase.

Let me explain what phases are. Say you want n = 3 phases. The original picture size is divided by 2^(n-1), so if the picture was 500x333, the 1st-phase picture size would be 500x333 / 4 = 125x83.

Phase 2 would be 250x166 and phase 3 500x333.

This essentially applies a form of super-resolution as it progresses through resolution doublings, adding more and more detail at each phase.
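The resolution schedule described above (each phase doubling the previous one until the full size is reached) can be sketched as follows; `phase_sizes` is a made-up helper name, and the integer right-shift floors each dimension, matching 333/4 → 83:

```python
def phase_sizes(width, height, n_phases):
    """Resolution schedule for phased rendering: the k-th phase renders at
    1/2^(n_phases - k) of the full size, so each phase doubles the last."""
    return [(width >> (n_phases - k), height >> (n_phases - k))
            for k in range(1, n_phases + 1)]

print(phase_sizes(500, 333, 3))   # → [(125, 83), (250, 166), (500, 333)]
```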


titu1994 commented on July 28, 2024

@bmaltais Do you have a working script that implements phases?


bmaltais commented on July 28, 2024

@titu1994 I don't, but I was thinking of creating one using bash. One issue with the current code is that when you set the size with --img_size, the aspect ratio is thrown out the window, resulting in a square output. This has the drawback of increasing iteration time and requires resizing at the end.

Here is the series of commands I used to produce the phased result:

python improved_neural_doodle.py --nlabels 4 --style-image ../in/vg/srcl.jpg --style-mask ../in/vg/srcl-m.png --target-mask ../in/vg/dst-m.png  --target-image-prefix ./doodle3-125 --content_weight 0 --style_weight 1000000 --num_iter 50 --content-image ../in/vg/dsto.jpg --img_size 125

python improved_neural_doodle.py --nlabels 4 --style-image ../in/vg/srcl.jpg --style-mask ../in/vg/srcl-m.png --target-mask ../in/vg/dst-m.png  --target-image-prefix ./doodle3-250 --content_weight 0 --style_weight 1000000 --num_iter 50 --content-image ./doodle3-125_at_iteration_16.png --img_size 250

python improved_neural_doodle.py --nlabels 4 --style-image ../in/vg/srcl.jpg --style-mask ../in/vg/srcl-m.png --target-mask ../in/vg/dst-m.png  --target-image-prefix ./doodle3-500 --content_weight 0 --style_weight 1000000 --num_iter 50 --content-image ./doodle3-250_at_iteration_8.png --img_size 500

Then a final resize of ./doodle3-500_at_iteration_20.png to 500x333.


titu1994 commented on July 28, 2024

@bmaltais The aspect ratio should still be maintained if --img_size is not -1 (i.e. any other value). See lines 71-72 in improved_neural_doodle.py.

Also do you suggest that each phase go through 50 full iterations? Or is 20 - 20 - 20 sufficient?


bmaltais commented on July 28, 2024

@titu1994 I can confirm that the ratio is not maintained... well, maybe it is for the final iteration, but the intermediate ones are not. I have not let it run to the 50th iteration, so I can't confirm.

Regarding the number of iterations per phase: if one specifies 50, then it should be 50 per phase, but that is clearly overkill for the intermediate phases. I would rather opt for a cutoff based on the style improvement percentage. Say one sets the cutoff at 5%: as soon as the improvement drops below 5%, it moves to the next phase. The last phase would then run to 50, unless the same cutoff is applied there too.

UPDATE

Even the last iteration is not properly sized, so the ratio is not maintained.


titu1994 commented on July 28, 2024

@bmaltais Another issue I see is that I can't put this logic inside the script itself. Many of the variables which are required (and initialized) when using --content-image are not initialized the first time, when the script runs without --content-image.

The way you have done it externally initializes all the variables and losses concerned with the content image in the second and later phases. I think an external script which drives this script is the correct way to manage this.

Do you have a better idea? Calling a script from a script isn't really an optimal solution.

Edit:
I will look at what is happening to the aspect ratio once again. It's being overridden somewhere.


titu1994 commented on July 28, 2024

@bmaltais Are you sure you have updated the script? Try printing your img_nrows and img_ncols at the end, after x has been initialized and before the loop. When I use --img_size 400, it auto-scales to 400 x 600 for me.

Edit:
I updated the script to early-stop when the improvement is less than the minimum improvement.


titu1994 commented on July 28, 2024

I think this is a prime piece of functionality for the Windows helper program. The above changes to the three scripts can easily be accommodated in the program, which runs each script independently.

This leaves non-Windows users in a problematic scenario, so if you could come up with a bash script, I would be grateful. In addition, I will be adding this technique to the Usage section of Neural Doodles.
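Such a phase driver could also be sketched in Python rather than bash. The command layout below is copied from the invocations shown earlier in the thread; the output-prefix scheme and the saved-file name are assumptions based on the proposed early-stop change:

```python
import subprocess

STYLE_ARGS = ["--nlabels", "4",
              "--style-image", "../in/vg/srcl.jpg",
              "--style-mask", "../in/vg/srcl-m.png",
              "--target-mask", "../in/vg/dst-m.png",
              "--content_weight", "0",
              "--style_weight", "1000000",
              "--num_iter", "50"]

def phase_commands(width, n_phases, first_content="../in/vg/dsto.jpg"):
    """Build one improved_neural_doodle.py command per phase, feeding each
    phase's saved output in as the next phase's --content-image."""
    commands, content = [], first_content
    for k in range(1, n_phases + 1):
        size = width >> (n_phases - k)       # each phase doubles the last
        prefix = "./doodle-%d" % size        # illustrative prefix scheme
        commands.append(["python", "improved_neural_doodle.py"] + STYLE_ARGS +
                        ["--target-image-prefix", prefix,
                         "--content-image", content,
                         "--img_size", str(size)])
        content = prefix + ".png"            # name saved by the early-stop change
    return commands

for cmd in phase_commands(500, 3):
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)        # uncomment to actually run
```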

By the way, what was ../in/vg/dsto.jpg for in the first execution?


bmaltais commented on July 28, 2024

There must be something different on Linux, as the resized image is square instead of maintaining the aspect ratio of the original... strange.

I will see if I can make a proper script.

Doing a quick read of the code... is it possible that all the content, mask and style images are resized to the exact same size? Doesn't that break the aspect ratio of the source style shapes? I might be reading the code wrong, on the other hand.


bmaltais commented on July 28, 2024

OK. I found the issue with the aspect ratio. Line 72 needs to be changed to:

aspect_ratio = float(ref_img.shape[1]) / float(ref_img.shape[0])
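Assuming ref_img.shape is laid out as (rows, cols, channels), the corrected line computes width/height, so the column count can follow the requested row count. The surrounding variable names below are guesses based on the script names mentioned in the thread:

```python
# Hypothetical surroundings of the fixed line; names are assumptions.
ref_rows, ref_cols = 333, 500                     # e.g. a 500x333 style image
aspect_ratio = float(ref_cols) / float(ref_rows)  # width / height

img_nrows = 400                                   # requested --img_size
img_ncols = int(img_nrows * aspect_ratio)         # width scales to keep the ratio
print(img_nrows, img_ncols)                       # → 400 600
```

This matches the "auto scales to 400 x 600" observation earlier in the thread.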


bmaltais commented on July 28, 2024

I would also like to propose the following change to the last function:

if args.min_improvement != 0.0:
    # Skip iteration 1, whose improvement is always 0%
    if improvement < args.min_improvement and i > 1:
        print("Script is early stopping since improvement (%0.2f) < min improvement (%0.2f)" %
              (improvement, args.min_improvement))
        ffname = target_img_prefix + '.png'
        imsave(ffname, img)
        exit()

This way it does not stop on the 1st iteration (where the improvement is always 0%), and it saves the output with a name one can easily predict before exiting. This will simplify scripting.


titu1994 commented on July 28, 2024

@bmaltais I'll update the calculation of aspect_ratio in both scripts. I did not think this would differ from platform to platform.

All images are rescaled to the referenced style dimensions, not the content dimensions (because a content image may not be provided). Therefore, the aspect ratio of the source style shapes is always maintained. If a content image of another shape is provided, it will be rescaled to the size of the style image (which will alter the content image's aspect ratio).

I agree with the modified version of min improvement. I overlooked the fact that the improvement on the first iteration will always be 0. Thanks. I'll update in some time.

Edit:
Updated the two scripts. Thanks for the changes.
