
Comments (7)

htoyryla commented on May 16, 2024:

I used image pairs like this

[image: mar05-0144, input/target pair]

One can take any landscape and/or city view, resize it to 512x512 px (the target), then crop the 256x256 center and resize that back to 512x512 (the input). The task may therefore be quite difficult for the model, since it has to scale the input down and fill in the missing surroundings, so it is no wonder there are artifacts.
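The pair generation is roughly the following (a minimal Pillow sketch of what I described, not my actual script; the interpolation filter is just an illustrative choice):

```python
from PIL import Image

def make_pair(path, size=512, crop=256):
    """Target: the photo resized to size x size.
    Input: its crop x crop center, blown back up to size x size."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    off = (size - crop) // 2
    center = target.crop((off, off, off + crop, off + crop))
    inp = center.resize((size, size), Image.LANCZOS)
    return inp, target
```

Arranging and saving the pairs in whatever layout the training script expects is left out here.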

I have already tried training with the same dataset using the PixelGAN, too. It looked OK but very blurry and wasn't improving, so I interrupted it and then modified my dataset to pairs like this (pasting the center crop onto a white background, so that the task is only to fill in the missing part).

[image: mar05-0144, input with white border / target pair]
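In code terms, the only change from the sketch above is that the center crop is pasted, unscaled, onto a white canvas (again just an illustrative sketch):

```python
from PIL import Image

def make_pair_white_border(path, size=512, crop=256):
    """Target: the photo resized to size x size.
    Input: the same image with everything outside the center crop blanked to white."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    off = (size - crop) // 2
    center = target.crop((off, off, off + crop, off + crop))
    inp = Image.new("RGB", (size, size), "white")
    inp.paste(center, (off, off))  # keep the crop at its original scale and position
    return inp, target
```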

So far (around 50 epochs on 2,500 images) it produces output like this, from the input shown above:

[image: mar05-0144, output of the white-border model]

So this version of the dataset keeps the center quite close to the input photo and then fills in the rest, already quite seamlessly.

For comparison, here is the same image processed with the model trained on the first dataset. The results are more surrealistic, but the artifact is visible in the middle. On the other hand, it is interesting to get quite a painting-like quality from photos alone.

[image: mar05-0144, output of the model trained on the first dataset]

My dataset contains many photos from Norway and Switzerland, so the model has a tendency to place mountains on the sides.

I had missed that the patches are convolved across the whole image. I can see how the zero padding could cause artifacts in the middle when the input is scaled down toward the center. I will experiment further and change the code where necessary.


phillipi commented on May 16, 2024:

Very cool! I'd be curious to see some input/output pairs for your experiments. Do you mind sharing those?

By default the code uses the 70x70 PatchGAN. You can change this by setting opt.n_layers_D to a different value (see usage in models.lua, defineD_n_layers).
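For intuition, the receptive field of the n-layer PatchGAN can be worked out like this (a quick sketch assuming the usual 4x4 kernels, with n stride-2 layers followed by two stride-1 layers; opt.n_layers_D = 0 is the 1x1 PixelGAN special case):

```python
def patchgan_receptive_field(n_layers, kernel=4):
    """Receptive field, in input pixels, of one output 'patch' of the discriminator."""
    if n_layers == 0:                      # PixelGAN: 1x1 convolutions only
        return 1
    strides = [2] * n_layers + [1, 1]      # n stride-2 convs, then two stride-1 convs
    rf = 1
    for s in reversed(strides):            # walk back from the output to the input
        rf = (rf - 1) * s + kernel
    return rf

for n in range(6):
    print(n, patchgan_receptive_field(n))  # 0->1, 1->16, 2->34, 3->70, 4->142, 5->286
```

The default opt.n_layers_D = 3 is what gives the 70x70 patches mentioned above.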

I'm not sure exactly why you are getting the artifacts in the center. The patch discriminator is run convolutionally across the entire image, so it looks at more than just the center. My guess is the artifact is due to boundary effects: the generator and discriminator both pad the image with zeros, and this means that pixels away from the center of the image have systematically different input than pixels at the center. That definitely creates issues with lots of artifacts near the image borders.

Sometimes these kinds of artifacts will go away if you just train for a really long time :). Another thing you could try is to switch from zero padding to mirror padding or something better (it will take some coding work).
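For anyone who wants to try the mirror-padding idea: in a PyTorch re-implementation it amounts to replacing each convolution's implicit zero padding with an explicit reflection pad (illustrative sketch only; in the Torch code a SpatialReflectionPadding module should play the corresponding role):

```python
import torch
import torch.nn as nn

# Zero padding: what the default convolutions do
zero_pad = nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1)

# Reflection ("mirror") padding: pad explicitly, then convolve without padding
mirror_pad = nn.Sequential(
    nn.ReflectionPad2d(1),
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=0),
)

x = torch.randn(1, 3, 256, 256)
print(zero_pad(x).shape, mirror_pad(x).shape)  # both torch.Size([1, 64, 128, 128])
```

Recent PyTorch versions also accept padding_mode="reflect" directly in nn.Conv2d.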


htoyryla commented on May 16, 2024:

Training with a dataset in which the inputs are cropped off-center and resized appears not to produce artifacts (at least none as easily noticeable, and not always in the same place). I think this is an interesting finding too. The crop was made toward the upper-left corner, like this:

[image: input cropped toward the upper-left corner]
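The off-center variant only moves where the crop is taken from; the offsets below are illustrative, not my exact values:

```python
from PIL import Image

def make_pair_off_center(path, size=512, crop=256, top=64, left=64):
    """Like make_pair above, but the crop is taken toward the upper-left corner."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    patch = target.crop((left, top, left + crop, top + crop))
    inp = patch.resize((size, size), Image.LANCZOS)
    return inp, target
```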

Here are some samples of the output:

[images: five output samples]


htoyryla commented on May 16, 2024:

My second dataset, in which the crop was made at the center and not resized, leaving a white border, is probably a more appropriate "fill in what's missing" task. For some images it works quite well: the center is from the photo, and the rest is added by the model.

[image: dsc00625, output of the white-border model]

Sometimes the effect is not so realistic; the image changes toward the edges as if in a dream (same image as above, but after one more day of training).

[image: mar05-0144, same image after one more day of training]


ppwwyyxx commented on May 16, 2024:

This is very interesting. One thing to note is that the generator architecture (the U-Net) seems to be specifically designed for image pairs that share a common structure. This is true for all the cases in the paper, but not for your task.

For your task, however, the center region would be roughly shared between input and output, but at a different scale. I think that may explain the high-frequency noise in the center: the network tries to compress the center patch of the input. I suggest you try a different crop location (e.g. crop a bit to the left rather than at the center) and see whether the high-frequency region appears in a different place.
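To make the "shared structure" point concrete: the U-Net's skip connections copy encoder activations straight into the decoder at the same spatial positions, which helps exactly when input and output are aligned (a toy PyTorch-style sketch of the idea, not the actual pix2pix generator):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Two-level toy U-Net, only meant to illustrate the skip connection."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv2d(3, 64, 4, stride=2, padding=1)                # 256 -> 128
        self.enc2 = nn.Conv2d(64, 128, 4, stride=2, padding=1)              # 128 -> 64
        self.dec2 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)     # 64 -> 128
        self.dec1 = nn.ConvTranspose2d(64 + 64, 3, 4, stride=2, padding=1)  # 128 -> 256

    def forward(self, x):
        e1 = self.enc1(x)                    # fine spatial detail at 128x128
        e2 = self.enc2(torch.relu(e1))
        d2 = self.dec2(torch.relu(e2))
        # Skip connection: e1 is concatenated at the same spatial locations,
        # so the copied detail only lands in the right place when input and
        # target are aligned, and not when the input is a rescaled crop.
        return torch.tanh(self.dec1(torch.relu(torch.cat([d2, e1], dim=1))))

print(TinyUNet()(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 3, 256, 256])
```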


htoyryla commented on May 16, 2024:

"One thing to note that the generator architecture (the U-net) seems to be specifically designed for image pairs which share a common structure. This is true for all the cases in the paper, but not for your task."

I have been aware of this, which is why I called my experiment "extreme". My second dataset is better suited in this respect, as the middle part does not need to be scaled or moved; the layout of the image stays the same while some content is missing. On the other hand, I am after artistically useful effects, and both sets do produce interesting results, so my findings should absolutely not be read as criticism :)

I will experiment with a different crop position and with a different padding type, but this will take time. It looks like I need to install CUDA on my second Linux box, too.


htoyryla commented on May 16, 2024:

No need to reopen... this is not really an issue in need of a solution, just discussion. If it is not appropriate here, I will continue on my blog. In any case, my two recent comments were a response to the suggestions I received, so I felt I should report my findings.

Anyway, great tool even for some more unconventional tasks!

