realfill's People

Contributors

alexsuakim, maschenb, thuanz123

realfill's Issues

process on the border of mask and original images

Sorry to bother you again.
I noticed that you process the borders of the dataset masks and original images during inference:
[image]
I followed the procedure described in the paper, but the results are bad. Could you provide the processing code?
Best

Is it possible to adjust this approach for image restoration?

Is it technically possible to adapt the method from the paper to denoise, upscale, or colorize images using references?
For example, I might have a noisy scanned negative of a room along with a much better digital-era image of the same scene with few differences, or a higher-resolution version of that scene to guide upscaling of the target.

[Inference]

Hi,
Thanks for sharing your implementation.
I ran into the error below, and I am new to the Diffusers library,
so I would like to ask how to fix it.

Traceback (most recent call last):
  File "/workspace/realfill/infer.py", line 71, in <module>
    results = pipe(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/3DGS-Diff/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 1346, in __call__
    latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 512 for tensor number 2 in the list.

Here is my script

export MODEL_NAME="stabilityai/stable-diffusion-2-inpainting"
export TRAIN_DIR="data/flowerwoman"
export OUTPUT_DIR="flowerwoman-model"

accelerate launch train_realfill.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$TRAIN_DIR \
  --output_dir=$OUTPUT_DIR \
  --resolution=512 \
  --train_batch_size=16 \
  --gradient_accumulation_steps=1 \
  --unet_learning_rate=2e-4 \
  --text_encoder_learning_rate=4e-5 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=100 \
  --max_train_steps=2000 \
  --lora_rank=8 \
  --lora_dropout=0.1 \
  --lora_alpha=16
# There's no problem during training !!
accelerate launch infer.py \
  --model_path flowerwoman-model \
  --output_dir $OUTPUT_DIR \
  --validation_image $TRAIN_DIR/target/target.png \
  --validation_mask $TRAIN_DIR/target/mask.png

Thanks in advance
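
The traceback shows `torch.cat` along dimension 1 failing because one input is 512 wide where 64 is expected. In Stable Diffusion inpainting, the UNet input is built from the latents, the mask, and the masked-image latents, all at latent resolution (image size divided by the VAE scale factor, 8 for this model family). Below is a minimal sketch of that shape bookkeeping in plain Python. The helper names are hypothetical and the observed shapes are illustrative; it does not reproduce the pipeline itself.

```python
# Sketch only: helper names are hypothetical, not part of diffusers.
VAE_SCALE_FACTOR = 8  # assumption: the usual factor for Stable Diffusion VAEs


def expected_unet_inputs(batch, height, width, latent_channels=4):
    """Return the (B, C, H, W) shapes an SD inpainting UNet input is built from."""
    h, w = height // VAE_SCALE_FACTOR, width // VAE_SCALE_FACTOR
    return {
        "latent_model_input": (batch, latent_channels, h, w),
        "mask": (batch, 1, h, w),
        "masked_image_latents": (batch, latent_channels, h, w),
    }


def find_mismatches(actual_shapes, expected_shapes):
    """List the tensors whose non-channel dimensions differ from the expectation."""
    bad = []
    for name, expected in expected_shapes.items():
        actual = actual_shapes[name]
        if actual[0] != expected[0] or actual[2:] != expected[2:]:
            bad.append((name, actual, expected))
    return bad


expected = expected_unet_inputs(batch=1, height=512, width=512)
# Illustrative shapes consistent with the reported error:
# one input is still at pixel resolution instead of latent resolution.
observed = dict(expected, masked_image_latents=(1, 3, 512, 512))
for name, actual, want in find_mismatches(observed, expected):
    print(f"{name}: got {actual}, expected {want}")
```

The traceback also shows diffusers being imported from /workspace/3DGS-Diff/diffusers/src, so a version different from the one pinned in requirements.txt may be in use. That is a guess, not a confirmed cause.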

AttributeError: module 'torchvision.transforms.v2' has no attribute 'ToImageTensor'

I'm running torchvision==0.17.1, as specified in requirements.txt, but it seems to be missing a required attribute. Is there another version of torch/torchvision that I should be targeting?

Traceback (most recent call last):
  File "train_realfill.py", line 955, in <module>
    main(args)
  File "train_realfill.py", line 695, in main
    train_dataset = RealFillDataset(
  File "train_realfill.py", line 449, in __init__
    transforms_v2.ToImageTensor(),
AttributeError: module 'torchvision.transforms.v2' has no attribute 'ToImageTensor'
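
In torchvision 0.16 the v2 transform `ToImageTensor` was renamed to `ToImage`, which would explain the error under 0.17.1. One possible workaround, sketched below, is to look up whichever name the installed module provides. `resolve_transform` is a hypothetical helper, and the stand-in namespaces only mimic the two torchvision layouts so the snippet runs without torchvision installed.

```python
from types import SimpleNamespace


def resolve_transform(module, *candidate_names):
    """Return the first attribute of `module` found among `candidate_names`."""
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise AttributeError(f"none of {candidate_names} found on {module!r}")


# Stand-ins for the two layouts (hypothetical; real code would pass
# `torchvision.transforms.v2` instead).
old_layout = SimpleNamespace(ToImageTensor="old-class")
new_layout = SimpleNamespace(ToImage="new-class")

for layout in (old_layout, new_layout):
    to_image = resolve_transform(layout, "ToImage", "ToImageTensor")
    print(to_image)

# With torchvision installed, the call would look like:
#   import torchvision.transforms.v2 as transforms_v2
#   to_image_cls = resolve_transform(transforms_v2, "ToImage", "ToImageTensor")
#   transform = to_image_cls()
```

Pinning an older torchvision that still ships `ToImageTensor` is the other option, but which versions the rest of the training script tolerates is not established in this thread.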

inference

You are doing a great job, and I have a few questions for you:

1. What does outpainting look like without LoRA training?

2. Have you tested the maximum number of times the original image can be outpainted?

3. There is no direct inference script.

License?

Hi, it would be great to know which license you're releasing this under. Ideally Apache 2.0 or MIT?

Potential bug

Hi author,

Thanks for sharing the prompt implementation!

I just noticed a potential bug at

weightings = self.image_transforms(weighting)

The target image and the target weighting are randomly augmented separately, which misaligns the two and leaks information about the masked region into the model.
You probably want to use torchvision.tv_tensors and transforms_v2.RandomCrop.

Please let me know if I got anything wrong, since I have only read the code and have not run it yet.
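
The concern here is that sampling augmentation parameters independently for the image and its weighting map breaks their pixel correspondence. A minimal sketch of the alternative, written in plain Python on nested lists rather than the repository's torchvision pipeline: sample the crop offset once and apply it to both. `crop` and `paired_random_crop` are hypothetical names.

```python
import random


def crop(grid, top, left, size):
    """Cut a size x size window out of a 2D list."""
    return [row[left:left + size] for row in grid[top:top + size]]


def paired_random_crop(image, weighting, size, rng):
    """Sample one crop offset and apply it to both inputs so they stay aligned."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return crop(image, top, left, size), crop(weighting, top, left, size)


# Toy 4x4 example: the weighting marks pixels whose value is >= 8.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
weighting = [[1 if v >= 8 else 0 for v in row] for row in image]

img_crop, w_crop = paired_random_crop(image, weighting, size=2, rng=random.Random(0))

# Alignment check: the cropped weighting still describes the cropped image.
aligned = all(
    (v >= 8) == bool(w)
    for img_row, w_row in zip(img_crop, w_crop)
    for v, w in zip(img_row, w_row)
)
print(aligned)
```

In torchvision's v2 API, passing the image and a `tv_tensors.Mask` together through a single transform call should give the same shared-parameter behavior, as the issue suggests.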

Weighting for MSE Loss

Hi,

Thanks for your work. I had one question about the weighting variable used for the MSE loss for noise residuals with reference/target images.

Since you define it as

weighting = Image.new("L", image.size)

or as the mask of a target image, the condition

example["weightings"] = weighting < 0

would zero out the loss for both reference and target images as weighting becomes zero for the latent space after interpolation from the pixel space. As the masks are either 0 or 1, the weightings tensor would be 0 everywhere. I think the correct condition would be

example["weightings"] = weighting <= 0

Please correct me if I am wrong.
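
Whether `weighting < 0` is all-False depends on the value range of `weighting` at the point of comparison, which is not shown in this thread. Below is a small sketch in plain Python of both cases: raw mask values in {0, 1}, and values rescaled to {-1, 1} the way a `Normalize([0.5], [0.5])`-style step would. That such a rescaling runs before the comparison is an assumption, not something confirmed above.

```python
def compare(values, op):
    """Apply a threshold test against zero elementwise and return booleans."""
    if op == "<":
        return [v < 0 for v in values]
    return [v <= 0 for v in values]


raw_mask = [0.0, 1.0, 0.0, 1.0]                  # mask values in {0, 1}
rescaled = [(v - 0.5) / 0.5 for v in raw_mask]   # values in {-1, 1}

print(compare(raw_mask, "<"))    # strict test on raw values
print(compare(raw_mask, "<="))   # non-strict test on raw values
print(compare(rescaled, "<"))    # strict test after rescaling
```

On raw {0, 1} values the strict test selects nothing, which matches the concern raised here. After rescaling, the strict test selects the same entries as `<= 0` does on raw values, so the two conditions differ only if the comparison happens before any rescaling.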

Version of transformers

Hi! I am having some trouble training RealFill.
With the latest version of transformers, I get this error: OSError: stabilityai/stable-diffusion-2-inpainting does not appear to have a file named tokenizer/config.json.
I found some suggestions here and reinstalled transformers==4.22.1, but then hit another error: ImportError: cannot import name 'CLIPTextModelWithProjection' from 'transformers'. Which version of transformers works with realfill?
Thanks a lot!

[suggestion] scene variants example

Hi. Thank you for your code! I read the original paper, which says:

Figure 9. RealFill is able to generate multiple scene variants when conditioned on a blank image as input, e.g., people are added or removed in the first and second rows. This suggests that the finetuned model can relate the elements inside the scene in a compositional manner.

I am confused about what the blank image means. Could you also give an example of this case? Thank you for your reply.

[image]

AttributeError: 'DistributedDataParallel' object has no attribute 'base_model'

Hello. Thank you for your code! However, I'm getting the following error when I run it on a machine with 2 NVIDIA RTX A6000s after 500 steps.

Steps:  25%|████████████████████████                                                                        | 500/2000 [34:15<1:39:43,  3.99s/it, loss=0.0361]
11/09/2023 21:23:13 - INFO - accelerate.accelerator - Saving current state to flowerwoman-model/checkpoint-500
Traceback (most recent call last):
  File "train_realfill.py", line 956, in <module>
    main(args)
  File "train_realfill.py", line 894, in main
    accelerator.save_state(save_path)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/accelerator.py", line 2793, in save_state
    hook(self._models, weights, output_dir)
  File "train_realfill.py", line 634, in save_model_hook
    sub_dir = "unet" if isinstance(model.base_model.model, type(accelerator.unwrap_model(unet.base_model.model))) else "text_encoder"
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DistributedDataParallel' object has no attribute 'base_model'

I'm working in a Docker container. Here are the steps I take to reach that error with your default example:

#start a container like so:
#docker run -it --rm --ipc=host --gpus all nvidia/cuda:11.7.1-devel-ubuntu20.04

apt update
apt install git
apt install python3-pip

mkdir GitClones
cd GitClones
git clone https://github.com/thuanz123/realfill
cd realfill
pip install -r requirements.txt 

accelerate config default
#Run example https://github.com/thuanz123/realfill#toy-example

I would be grateful for any feedback!
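
The traceback indicates that `save_model_hook` receives a model still wrapped in `DistributedDataParallel`, which keeps the original network under `.module` and does not forward arbitrary attributes such as `base_model`. Below is a minimal sketch of the unwrap-before-access pattern. It uses stand-in classes (all hypothetical) so that it runs without torch; in the real hook, `accelerator.unwrap_model(model)` would play the role of `unwrap`. Whether this matches the fix adopted in the repository is not known from this thread.

```python
class PeftLikeModel:
    """Stand-in for a LoRA-wrapped network exposing `base_model`."""

    def __init__(self, name):
        self.base_model = name


class DDPLikeWrapper:
    """Stand-in for DistributedDataParallel: the network lives under `.module`."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    """Peel off wrapper layers until the underlying network is reached."""
    while hasattr(model, "module"):
        model = model.module
    return model


wrapped = DDPLikeWrapper(PeftLikeModel("unet"))

try:
    wrapped.base_model  # mirrors the failing access in the hook
except AttributeError as err:
    print("direct access fails:", err)

print("after unwrap:", unwrap(wrapped).base_model)
```

This would also explain why the error appears only on the 2-GPU machine: with a single process there is no distributed wrapper, so the direct attribute access succeeds.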
