
Comments (4)

woctezuma commented on July 24, 2024

I assume that what you call a "StyleGAN encoder" is code like this one. Similarly to what was later suggested in the StyleGAN2 article, the aim is to project a real image into the latent space of the generative model. What you get is one of the generated images closest to your high-resolution input: the images are close at high resolution.

The original paper is about training the StyleGAN2 generative model, and the ability to project real images is a nice property. People then built tools to do so because they wanted to have fun editing real images: starting from the projection, you can edit an image by moving along specified latent directions. You get results like this one, where Mona Lisa's pose is changed.

StyleGAN2 encoder and editor

In contrast, PULSE looks for plausible images in the latent space which downscale correctly to the low-resolution input, e.g. PULSE produces a plausible 1024x1024 image which would downscale to something close to a 16x16 input image. There are many potential 1024x1024 images, so what you get is one among very many suitable candidates: the images are close at low resolution.
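To make the difference between the two objectives concrete, here is a minimal sketch on tiny grayscale images stored as lists of rows. It is a toy under stated assumptions, not code from either project: the names `downscale`, `projection_loss` and `pulse_style_loss` are made up for illustration, average pooling stands in for whatever downscaling operator is actually used, and no generator or latent space is involved.

```python
def downscale(image, factor):
    """Average-pool a square grayscale image (list of rows) by an integer factor.

    Assumption: average pooling is only a stand-in for the real downscaling operator.
    """
    size = len(image) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def mse(a, b):
    """Mean squared error between two images of the same shape."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def projection_loss(candidate_hr, target_hr):
    """Projector-style objective (hypothetical name): compare at high resolution."""
    return mse(candidate_hr, target_hr)


def pulse_style_loss(candidate_hr, target_lr, factor):
    """PULSE-style objective (hypothetical name): compare after downscaling."""
    return mse(downscale(candidate_hr, factor), target_lr)


# Two very different 4x4 candidates that share the same 2x2 downscaled version.
smooth = [[0.5] * 4 for _ in range(4)]
checker = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
target_lr = [[0.5, 0.5], [0.5, 0.5]]

print(pulse_style_loss(smooth, target_lr, 2))   # both candidates fit the
print(pulse_style_loss(checker, target_lr, 2))  # low-resolution input equally well
print(projection_loss(checker, smooth))         # yet they differ at high resolution
```

Both candidates are perfect with respect to the low-resolution objective, which is why that objective alone admits very many solutions, whereas the high-resolution objective tells them apart.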

You can play with Colab.

Figure 3 in the PULSE paper

Overall, I agree that there are broad similarities between the projects, in that both look for a plausible image in the latent space. However, as you pointed out, there are differences in terms of input (high-resolution vs. low-resolution) and objective (find the projection vs. find one of very many equally probable candidates).

One cool experiment could be to play with these two Colab notebooks (projector vs. upsampling) and compare the kind of results that you get by feeding a low-resolution image to the StyleGAN2 projector. I would not expect the projector to perform well at all, because it will try to produce a blurry high-resolution image, i.e. one which looks like the low-resolution image, only at a higher resolution. It is quite easy to produce unrealistic images with the projector, while PULSE sticks to producing plausible high-resolution images.

Anyway, let me know if I am mistaken, or if I rehashed information which you already knew. ;)

from pulse.

danielkaifeng commented on July 24, 2024

To take this a step further, I think neither stylegan-encoder nor PULSE can serve as a general SR application. You can't get SR for a full-body photo of a person unless you train a GAN for it. So the SR quality relies a lot on your pretrained GAN: train a good GAN and you get nice results, and vice versa.

Even for face generation, we must align the face first, which means that the eyes, mouth, etc. should be in exactly the expected positions. It would be much better to generalize it to image input with less preprocessing.


danielkaifeng commented on July 24, 2024

With the pretrained projector weights and some careful fine-tuning, you can get a very similar HR image, like this one:
[image: face1_01]

For a very blurry face, you can use a very small learning rate and increase the weight of the GAN loss to ensure that it doesn't generate a blurry HR image.
[image: 1_01]

I believe stylegan-encoder's projection method can do a good job in face SR, and PULSE improves its generalization by changing the optimization target from (input vs. prediction) to (input vs. the downsampled, low-resolution version of the prediction).
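The change of optimization target described above can be sketched as a latent search where only the loss function is swapped. Everything here is an assumption made for illustration: `generator` is a hypothetical two-parameter toy rather than a StyleGAN, signals are 1-D lists of four "pixels", downsampling is pairwise averaging, and the gradient is estimated by finite differences instead of backpropagation.

```python
def generator(z):
    """Hypothetical toy generator: z[0] sets brightness, z[1] adds fine detail."""
    base, detail = z
    return [base + detail, base - detail, base + detail, base - detail]


def downsample(signal):
    """Average neighbouring pairs (stand-in for a real downscaling operator)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def mse(a, b):
    """Mean squared error between two signals of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def optimize(loss_fn, z0, lr=0.1, steps=200, eps=1e-5):
    """Plain gradient descent on the latent, with central finite differences."""
    z = list(z0)
    for _ in range(steps):
        grad = []
        for i in range(len(z)):
            z_plus, z_minus = z[:], z[:]
            z_plus[i] += eps
            z_minus[i] -= eps
            grad.append((loss_fn(z_plus) - loss_fn(z_minus)) / (2 * eps))
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


target_hr = [0.7, -0.1, 0.7, -0.1]  # high-resolution input (projection setting)
target_lr = [0.3, 0.3]              # low-resolution input (PULSE-like setting)

# (input vs. prediction): the loss is computed at high resolution.
z_direct = optimize(lambda z: mse(generator(z), target_hr), [0.0, 0.0])

# (input vs. downsampled prediction): the loss is computed at low resolution.
z_lowres = optimize(lambda z: mse(downsample(generator(z)), target_lr), [0.0, 0.4])

print(z_direct)  # both latent coordinates are determined by the target
print(z_lowres)  # only the brightness is constrained; the detail keeps its start value
```

In this toy, the low-resolution loss leaves the detail coordinate free, so the result depends on the starting point; in a real setup, it is the pretrained generator and any extra loss terms that decide which plausible detail gets filled in.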


yuqiu1233 commented on July 24, 2024


Can you tell me how to finetune this project? thanks

