
Comments (7)

htoyryla commented on May 16, 2024:

I used image pairs like this

[image: mar05-0144, input/target pair]

One can take any landscape and/or city view, resize it to 512x512 px (the target), then crop the 256x256 center and resize that back to 512x512 (the input). The task may therefore be quite difficult for the model, since it has to scale the input down and fill in the missing surroundings, so it is no wonder there are artifacts.
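The pair generation is roughly the following (a minimal Pillow sketch of what I described, not my actual script; the interpolation filter is just an illustrative choice):

```python
from PIL import Image

def make_pair(path, size=512, crop=256):
    """Target: the photo resized to size x size.
    Input: its crop x crop center, blown back up to size x size."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    off = (size - crop) // 2
    center = target.crop((off, off, off + crop, off + crop))
    inp = center.resize((size, size), Image.LANCZOS)
    return inp, target
```

Arranging and saving the pairs in whatever layout the training script expects is left out here.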

I have already tried training with the same dataset using the PixelGAN, too. It looked OK but very blurry and wasn't improving, so I interrupted it and then modified my dataset to pairs like this (pasting the center crop onto a white background, so that the task is only to fill in the missing part).

[image: mar05-0144, input with white border / target pair]
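In code terms, the only change from the sketch above is that the center crop is pasted, unscaled, onto a white canvas (again just an illustrative sketch):

```python
from PIL import Image

def make_pair_white_border(path, size=512, crop=256):
    """Target: the photo resized to size x size.
    Input: the same image with everything outside the center crop blanked to white."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    off = (size - crop) // 2
    center = target.crop((off, off, off + crop, off + crop))
    inp = Image.new("RGB", (size, size), "white")
    inp.paste(center, (off, off))  # keep the crop at its original scale and position
    return inp, target
```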

So far (around 50 epochs on 2,500 images) it produces output like this, from the input shown above:

[image: mar05-0144, output of the white-border model]

So this version of the dataset keeps the center quite close to the input photo and then fills in the rest, already quite seamlessly.

For comparison, here is the same image processed with the model trained on the first dataset. The results are more surrealistic, but the artifact is visible in the middle. On the other hand, it is interesting to get quite a painting-like quality from photos alone.

[image: mar05-0144, output of the model trained on the first dataset]

My dataset contains many photos from Norway and Switzerland, so the model has a tendency to place mountains on the sides.

I had missed that the patches are convolved across the whole image. I can see how the zero padding could cause artifacts in the middle when the input is scaled down toward the center. I will experiment further and change the code where necessary.


phillipi commented on May 16, 2024:

Very cool! I'd be curious to see some input/output pairs for your experiments. Do you mind sharing those?

By default the code uses the 70x70 PatchGAN. You can change this by setting opt.n_layers_D to a different value (see usage in models.lua, defineD_n_layers).
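For intuition, the receptive field of the n-layer PatchGAN can be worked out like this (a quick sketch assuming the usual 4x4 kernels, with n stride-2 layers followed by two stride-1 layers; opt.n_layers_D = 0 is the 1x1 PixelGAN special case):

```python
def patchgan_receptive_field(n_layers, kernel=4):
    """Receptive field, in input pixels, of one output 'patch' of the discriminator."""
    if n_layers == 0:                      # PixelGAN: 1x1 convolutions only
        return 1
    strides = [2] * n_layers + [1, 1]      # n stride-2 convs, then two stride-1 convs
    rf = 1
    for s in reversed(strides):            # walk back from the output to the input
        rf = (rf - 1) * s + kernel
    return rf

for n in range(6):
    print(n, patchgan_receptive_field(n))  # 0->1, 1->16, 2->34, 3->70, 4->142, 5->286
```

The default opt.n_layers_D = 3 is what gives the 70x70 patches mentioned above.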

I'm not sure exactly why you are getting the artifacts in the center. The patch discriminator is run convolutionally across the entire image, so it looks at more than just the center. My guess is the artifact is due to boundary effects: the generator and discriminator both pad the image with zeros, and this means that pixels away from the center of the image have systematically different input than pixels at the center. That definitely creates issues with lots of artifacts near the image borders.

Sometimes these kinds of artifacts will go away if you just train for a really long time :). Another thing you could try is to switch from zero padding to mirror padding or something better (it will take some coding work).
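For anyone who wants to try the mirror-padding idea: in a PyTorch re-implementation it amounts to replacing each convolution's implicit zero padding with an explicit reflection pad (illustrative sketch only; in the Torch code a SpatialReflectionPadding module should play the corresponding role):

```python
import torch
import torch.nn as nn

# Zero padding: what the default convolutions do
zero_pad = nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1)

# Reflection ("mirror") padding: pad explicitly, then convolve without padding
mirror_pad = nn.Sequential(
    nn.ReflectionPad2d(1),
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=0),
)

x = torch.randn(1, 3, 256, 256)
print(zero_pad(x).shape, mirror_pad(x).shape)  # both torch.Size([1, 64, 128, 128])
```

Recent PyTorch versions also accept padding_mode="reflect" directly in nn.Conv2d.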


htoyryla commented on May 16, 2024:

Training with a dataset in which the inputs are cropped off-center and resized appears not to produce artifacts (at least none as easily noticeable, and not always in the same place). I think this is an interesting finding too. The crop was made toward the upper-left corner, like this:

[image: input cropped toward the upper-left corner]
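The off-center variant only moves where the crop is taken from; the offsets below are illustrative, not my exact values:

```python
from PIL import Image

def make_pair_off_center(path, size=512, crop=256, top=64, left=64):
    """Like make_pair above, but the crop is taken toward the upper-left corner."""
    target = Image.open(path).convert("RGB").resize((size, size), Image.LANCZOS)
    patch = target.crop((left, top, left + crop, top + crop))
    inp = patch.resize((size, size), Image.LANCZOS)
    return inp, target
```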

Here are some samples of the output:

[images: five output samples]


htoyryla commented on May 16, 2024:

My second dataset, in which the crop was made at the center and not resized, leaving a white border, is probably a more appropriate "fill in what's missing" task. For some images it works quite well: the center is from the photo, and the rest is added by the model.

[image: dsc00625, output of the white-border model]

Sometimes the effect is not so realistic; the image changes toward the edges as if in a dream (same image as above, but after one more day of training).

[image: mar05-0144, same image after one more day of training]


ppwwyyxx commented on May 16, 2024:

This is very interesting. One thing to note is that the generator architecture (the U-Net) seems to be specifically designed for image pairs that share a common structure. This is true for all the cases in the paper, but not for your task.

For your task, however, the center region would be roughly shared between input and output, but at a different scale. I think that may explain the high-frequency noise in the center: the network tries to compress the center patch of the input. I suggest you try a different crop location (e.g. crop a bit to the left rather than at the center) and see whether the high-frequency region appears in a different place.
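To make the "shared structure" point concrete: the U-Net's skip connections copy encoder activations straight into the decoder at the same spatial positions, which helps exactly when input and output are aligned (a toy PyTorch-style sketch of the idea, not the actual pix2pix generator):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Two-level toy U-Net, only meant to illustrate the skip connection."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv2d(3, 64, 4, stride=2, padding=1)                # 256 -> 128
        self.enc2 = nn.Conv2d(64, 128, 4, stride=2, padding=1)              # 128 -> 64
        self.dec2 = nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1)     # 64 -> 128
        self.dec1 = nn.ConvTranspose2d(64 + 64, 3, 4, stride=2, padding=1)  # 128 -> 256

    def forward(self, x):
        e1 = self.enc1(x)                    # fine spatial detail at 128x128
        e2 = self.enc2(torch.relu(e1))
        d2 = self.dec2(torch.relu(e2))
        # Skip connection: e1 is concatenated at the same spatial locations,
        # so the copied detail only lands in the right place when input and
        # target are aligned, and not when the input is a rescaled crop.
        return torch.tanh(self.dec1(torch.relu(torch.cat([d2, e1], dim=1))))

print(TinyUNet()(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 3, 256, 256])
```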


htoyryla commented on May 16, 2024:

"One thing to note that the generator architecture (the U-net) seems to be specifically designed for image pairs which share a common structure. This is true for all the cases in the paper, but not for your task."

I have been aware of this, which is why I called my experiment "extreme". My second dataset is better suited in this respect, as the middle part does not need to be scaled or moved; the layout of the image stays the same while some content is missing. On the other hand, I am after artistically useful effects, and both sets do produce interesting results, so my findings should absolutely not be read as criticism :)

I will experiment with a different crop position and with a different padding type, but this will take time. It looks like I need to install CUDA on my second Linux box, too.


htoyryla commented on May 16, 2024:

No need to reopen... this is not really an issue in need of a solution, just discussion. If it is not appropriate here, I will continue on my blog. In any case, my two recent comments were a response to the suggestions I received, so I felt I should report my findings.

Anyway, great tool even for some more unconventional tasks!

