
Comments (3)

pabloppp commented on May 25, 2024

Hi Paul! I'm glad you're enjoying our work! Let me try to solve these for you!

Is there a way to specify additional hyperparameters such as seed, image size, iterations...?

To use a seed, you can set the PyTorch random seed. You could do something like:

with torch.random.fork_rng():
    torch.manual_seed(42)
    sampled = sample(...)
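
To check how this behaves, here is a minimal sketch that swaps a plain torch.rand call in for sample(...). The stand-in fake_sample and the helper seeded are hypothetical names for illustration, not part of Paella. It suggests that two runs with the same seed match, and that fork_rng leaves the global RNG state untouched afterwards:

```python
import torch


def fake_sample():
    # Hypothetical stand-in for Paella's sample(...): any call that
    # draws numbers from PyTorch's global random generator.
    return torch.rand(4)


def seeded(seed):
    # fork_rng saves the global RNG state and restores it on exit,
    # so seeding inside the block does not leak into later random calls.
    with torch.random.fork_rng():
        torch.manual_seed(seed)
        return fake_sample()


state_before = torch.get_rng_state()
first = seeded(42)
second = seeded(42)
other = seeded(43)
state_after = torch.get_rng_state()

print("same seed gives same draw:", torch.equal(first, second))
print("different seed gives same draw:", torch.equal(first, other))
print("global RNG state restored:", torch.equal(state_before, state_after))
```

Keeping the seed inside fork_rng means the rest of the notebook keeps its own random stream, which can be convenient if only some samples need to be reproducible.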

As for image size, iterations, etc.: yes, the sample method has parameters like:
T & renoise_steps (by default 12 and 11), which define the sampling iterations. We have seen that, for better results, renoise_steps should be equal to T-1.
size, a tuple (by default (32, 32)) defining the size of the latent token grid that Paella will sample. The vqGAN then decodes those tokens to an image at 8x that resolution, so if you want to try a landscape image, you can use size=(32, 64).
There are other parameters, like a mask (for inpainting), an initial image (for image2image), etc. Take a look here: https://github.com/dome272/Paella/blob/main/utils.py#L29
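
The arithmetic behind these parameters can be sketched in a few lines. The helper names below are hypothetical, and the sketch assumes that size is ordered as (height, width), which is what the landscape example size=(32, 64) suggests:

```python
def decoded_resolution(size, scale=8):
    """Pixel (height, width) for a latent grid, assuming the x8 vqGAN
    upscaling described above and a (height, width) ordering of size."""
    latent_h, latent_w = size
    return latent_h * scale, latent_w * scale


def sampling_steps(T=12):
    """Return (T, renoise_steps) using the renoise_steps = T - 1 rule of thumb."""
    if T < 1:
        raise ValueError("T must be at least 1")
    return T, T - 1


print(decoded_resolution((32, 32)))  # (256, 256)
print(decoded_resolution((32, 64)))  # (256, 512)
print(sampling_steps(12))            # (12, 11)
```

The two values from sampling_steps would then be passed as T and renoise_steps, and the tuple as size, when calling sample.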

When you set the batch size to 1, the resulting image displayed is huge (not accurate to the actual resolution)

This is because we use matplotlib to display the images in the notebook. You can set a height & width in the showimages method, but you probably won't get the actual image size. The best way to get the actual size is to save the images, for example with torchvision.utils.save_image or our provided method saveimages, which brings me to the third question...

It would be great if the output (under /content/output in colab) could be one image. I understand that i can set batch size to 1, but I am looking to put in a number, and have that many images saved under /content/output. Having images glued together horizontally is less appealing (for me).

The function saveimages expects a batch of images and will automatically make a grid, but you can just call it with each image separately instead:

for i, img in enumerate(sampled):
    saveimages(img, mode + "_" + text + f"_{i}", nrow=len(sampled))

This should create a file for each image in the batch.


metaphorz commented on May 25, 2024

Pablo: This is a really nice, detailed response. Thanks! I set the seed as specified and this is working. I'm not sure what nominal values T and renoise_steps can be set to. For size, on an A100, I was able to do (64, 64) maximum, which yields a 512x512 image. Your loop suggestion worked great, as I now get individual output files.


dome272 commented on May 25, 2024

Hey there, technically you should be able to sample higher-resolution images. I can sample 128x128 latents on an A100. Maybe you can try removing the CLIP visual part if you only do text-conditional sampling, like here: https://github.com/dome272/Paella/blob/main/paella_minimal.py#L34
The clip.visual is a model with more than 1B parameters, which takes up a lot of memory. So maybe that helps?

