Some questions and observations after trying this out. Nice notebook! This is not an i

questions on parameters about paella HOT 3 CLOSED

dome272 commented on May 25, 2024

Comments (3)

pabloppp commented on May 25, 2024

Hi Paul! I'm glad you're enjoying our work! Let me try to solve these for you!

Is there a way to specify additional hyperparameters such as seed, image size, iterations...?

To use a seed, you can rely on PyTorch's random seed. You could do something like:

import torch

with torch.random.fork_rng():  # the previous RNG state is restored on exit
    torch.manual_seed(42)
    sampled = sample(...)

As for image size, iterations, etc.: yes, the sample method has parameters like:
T & renoise_steps (by default 12 and 11) define the number of sampling iterations. We see that for better results, renoise_steps should equal T-1.
size, a tuple (by default (32, 32)), defines the size of the latent tokens that Paella will sample. The vqGAN then decodes them to an image at 8x that resolution, so if you want to try a landscape image, you can use size=(32, 64).
There are other parameters, like a mask (for inpainting), an initial image (for image2image), etc. Take a look here: https://github.com/dome272/Paella/blob/main/utils.py#L29
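The size arithmetic described above can be sketched in a few lines. This is only an illustration, not code from the repository: latent_to_pixels and sampling_kwargs are hypothetical helpers, and the sketch assumes that size is ordered (height, width) and that the 8x decoder factor applies to both dimensions.

```python
def latent_to_pixels(size, scale=8):
    """Pixel resolution of the decoded image for a given latent size.

    Assumes size is (height, width) and that the vqGAN upscales both by `scale`.
    """
    height, width = size
    return height * scale, width * scale


def sampling_kwargs(T=12, size=(32, 32)):
    """Keyword arguments for sample(), keeping renoise_steps equal to T - 1."""
    if T < 1:
        raise ValueError("T must be at least 1")
    return {"T": T, "renoise_steps": T - 1, "size": size}


print(latent_to_pixels((32, 32)))  # (256, 256)
print(latent_to_pixels((32, 64)))  # (256, 512), a landscape image
print(sampling_kwargs(T=20, size=(32, 64)))
```

The resulting dict could then be unpacked into the call, as in sample(..., **sampling_kwargs(T=20, size=(32, 64))), with the remaining arguments left elided here.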

When you set the batch size to 1, the resulting image displayed is huge (not accurate to the actual resolution)

This is because we use matplotlib to display the images in the notebook. You can set a height & width in the showimages method, but you probably won't get the actual image size. The best way to get the actual size is to save the images, for example with torchvision.utils.save_image or our provided saveimages method, which brings me to the third question...

It would be great if the output (under /content/output in colab) could be one image. I understand that i can set batch size to 1, but I am looking to put in a number, and have that many images saved under /content/output. Having images glued together horizontally is less appealing (for me).

The function saveimages expects a batch of images and will automatically make a grid, but you can just call it with each image separately instead:

for i, img in enumerate(sampled):
    saveimages(img, mode + "_" + text + f"_{i}", nrow=len(sampled))

This should create a file for each image in the batch.
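Since the file name in the loop above is built from the prompt text, prompts containing spaces or slashes may produce awkward or invalid paths. A minimal sketch of one way to build safer names follows; output_name is a hypothetical helper that is not part of Paella, and it assumes saveimages simply uses the string it is given as the base file name.

```python
import re


def output_name(mode, text, index, max_len=80):
    """Build a filesystem-friendly name such as 'mode_prompt_words_3'."""
    # Collapse every run of characters other than letters, digits, or dashes
    # into a single underscore, then trim underscores at both ends.
    slug = re.sub(r"[^A-Za-z0-9-]+", "_", text).strip("_")
    return f"{mode}_{slug[:max_len]}_{index}"


names = [output_name("text", "a photo of a cat / dog", i) for i in range(3)]
print(names)
```

Each name could then be passed as the second argument of saveimages in place of mode + "_" + text + f"_{i}".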

metaphorz commented on May 25, 2024

Pablo: This is a really nice detailed response. Thanks! I set the seed as specified and this is working. Not sure what T and renoise_steps can be set to in terms of nominal values. For size, on an A100, I was able to do (64,64) maximum, which yields a 512x512 image. Your loop suggestion worked great, as I now get individual output files.

dome272 commented on May 25, 2024

Hey there, technically you should be able to sample higher-resolution images. I can sample 128x128 latents on an A100. Maybe you can try removing the CLIP visual part if you only do text-conditional sampling, like here: https://github.com/dome272/Paella/blob/main/paella_minimal.py#L34
The clip.visual is a >1B-parameter model, which takes up a lot of memory. So maybe that helps?
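For a rough sense of scale, the memory held by model weights alone is the parameter count times the bytes per parameter. The sketch below uses exactly 1B parameters as a lower bound for the ">1B" figure above. The precision in which the notebook loads CLIP is not stated here, so both fp32 and fp16 are shown as assumptions, and activations are not counted.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Memory occupied by the weights alone, in GiB (activations excluded)."""
    return num_params * bytes_per_param / 2**30


one_billion = 1_000_000_000  # lower bound for a ">1B parameter" model
fp32 = weight_memory_gib(one_billion, 4)  # 4 bytes per float32 weight
fp16 = weight_memory_gib(one_billion, 2)  # 2 bytes per float16 weight
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB")
```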
