thuanz123 / realfill
License: MIT License
Is it technically possible to adapt the method from the paper to denoise, upscale, or colorize images using reference images?
For example, suppose I have a noisy scanned negative of a room and a much better digital-era photo of the same scene with few differences, or a higher-resolution version of that scene that could guide upscaling the target correctly.
Hi,
Thanks for sharing your implementation.
I ran into the following error, and I am new to the Diffusers library.
Could you tell me how to fix it?
Traceback (most recent call last):
File "/workspace/realfill/infer.py", line 71, in <module>
results = pipe(
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/workspace/3DGS-Diff/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py", line 1346, in __call__
latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 64 but got size 512 for tensor number 2 in the list.
Here is my script
export MODEL_NAME="stabilityai/stable-diffusion-2-inpainting"
export TRAIN_DIR="data/flowerwoman"
export OUTPUT_DIR="flowerwoman-model"
accelerate launch train_realfill.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--train_data_dir=$TRAIN_DIR \
--output_dir=$OUTPUT_DIR \
--resolution=512 \
--train_batch_size=16 \
--gradient_accumulation_steps=1 \
--unet_learning_rate=2e-4 \
--text_encoder_learning_rate=4e-5 \
--lr_scheduler="constant" \
--lr_warmup_steps=100 \
--max_train_steps=2000 \
--lora_rank=8 \
--lora_dropout=0.1 \
--lora_alpha=16
# There's no problem during training !!
accelerate launch infer.py \
--model_path flowerwoman-model \
--output_dir $OUTPUT_DIR \
--validation_image $TRAIN_DIR/target/target.png \
--validation_mask $TRAIN_DIR/target/mask.png
Thanks in advance
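A note on reading the error above: torch.cat along dim=1 requires every other dimension to match. The Stable Diffusion VAE downsamples height and width by a factor of 8, so a 512-pixel input corresponds to 64-pixel latents. "Expected size 64 but got size 512 for tensor number 2" therefore suggests that masked_image_latents was still at pixel resolution when it was concatenated. Whether this comes from a mismatch between the installed diffusers version and the one the repository expects is an assumption, not something the traceback confirms. The sketch below mirrors the check on plain shape tuples; latent_size and find_mismatch are hypothetical helpers, not part of diffusers.

```python
from typing import Optional, Sequence, Tuple

VAE_SCALE_FACTOR = 8  # Stable Diffusion's VAE downsamples height and width by 8


def latent_size(pixel_size: int, scale: int = VAE_SCALE_FACTOR) -> int:
    """Spatial size of the latent for a given pixel size."""
    return pixel_size // scale


def find_mismatch(
    shapes: Sequence[Tuple[int, ...]], dim: int = 1
) -> Optional[Tuple[int, int, int]]:
    """Return (tensor_index, expected, got) for the first shape that cannot be
    concatenated along `dim`, or None if all shapes are compatible."""
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        for axis, (expected, got) in enumerate(zip(reference, shape)):
            # Only the concatenation axis is allowed to differ.
            if axis != dim and expected != got:
                return index, expected, got
    return None


size = latent_size(512)
latents = (1, 4, size, size)
mask = (1, 1, size, size)
masked_image_pixels = (1, 3, 512, 512)     # assumed: image not yet VAE-encoded
masked_image_latents = (1, 4, size, size)  # what the UNet input expects

bad = find_mismatch([latents, mask, masked_image_pixels])
good = find_mismatch([latents, mask, masked_image_latents])
print(bad, good)
```

The failing case reports the same tensor index and sizes as the RuntimeError, which is why the pixel-space hypothesis seems plausible.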
I'm running torchvision==0.17.1, as specified in requirements.txt, but it seems to be missing a required attribute. Is there another version of torch/torchvision that I should be targeting?
Traceback (most recent call last):
File "train_realfill.py", line 955, in <module>
main(args)
File "train_realfill.py", line 695, in main
train_dataset = RealFillDataset(
File "train_realfill.py", line 449, in __init__
transforms_v2.ToImageTensor(),
AttributeError: module 'torchvision.transforms.v2' has no attribute 'ToImageTensor'
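As far as I know, ToImageTensor belonged to the beta transforms.v2 API and was renamed to ToImage in later torchvision releases, which would be consistent with the AttributeError above on 0.17.1. One possible workaround is to look up whichever name the installed version provides. The sketch below is only an illustration: first_available is a hypothetical helper, and SimpleNamespace objects stand in for torchvision.transforms.v2 so the snippet runs without torchvision installed.

```python
from types import SimpleNamespace


def first_available(module, *names):
    """Return the first attribute of `module` that exists among `names`."""
    for name in names:
        if hasattr(module, name):
            return getattr(module, name)
    raise AttributeError(f"{module!r} provides none of {names}")


# Stand-ins for two generations of torchvision.transforms.v2 (illustrative only).
old_v2 = SimpleNamespace(ToImageTensor="old-class")
new_v2 = SimpleNamespace(ToImage="new-class")

to_image_old = first_available(old_v2, "ToImage", "ToImageTensor")
to_image_new = first_available(new_v2, "ToImage", "ToImageTensor")

# With torchvision installed, the same lookup would read:
#   from torchvision.transforms import v2 as transforms_v2
#   ToImage = first_available(transforms_v2, "ToImage", "ToImageTensor")
```

Pinning torchvision to a release that still ships the old name is the other obvious option; which one the author intended is not stated in this thread.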
You are doing a great job, and I have a few questions:
1. How well does outpainting work without LoRA training?
2. Have you tested the maximum outpainting factor relative to the original image?
3. There is no direct inference script.
I just want to try this out in Colab, thanks.
Hi, it would be great to know which license you're releasing this under. Ideally Apache 2.0 or MIT?
Have you measured how long training takes for one image? Also, why is max_train_steps now set to 2000 when it was previously set to 400?
Hi author,
Thanks for sharing the prompt implementation!
I just noticed a potential bug at line 481 in 3036340, concerning torchvision.tv_tensors and transforms_v2.RandomCrop.
Please let me know if I got anything wrong, since I have only read the code and haven't run it yet.
Hi,
Thanks for your work. I have a question about the weighting variable used in the MSE loss on the noise residuals for reference and target images.
Since you define it as
weighting = Image.new("L", image.size)
or as the mask of a target image, the condition
example["weightings"] = weighting < 0
would zero out the loss for both reference and target images, because the weighting becomes zero in latent space after interpolation from pixel space. Since the masks are either 0 or 1, the weightings tensor would be 0 everywhere. I think the correct condition would be
example["weightings"] = weighting <= 0
Please correct me if I am wrong.
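The point about the comparison can be checked on a toy mask. The sketch below uses nested Python lists in place of the tensor, and less_than and less_equal are hypothetical helpers. It only illustrates how < 0 and <= 0 behave on values in {0, 1}; it makes no claim about what the repository's training loop does with the result.

```python
def less_than(mask, threshold):
    """Element-wise `value < threshold` on a nested list."""
    return [[value < threshold for value in row] for row in mask]


def less_equal(mask, threshold):
    """Element-wise `value <= threshold` on a nested list."""
    return [[value <= threshold for value in row] for row in mask]


def count_true(flags):
    """Number of True entries in a nested list of booleans."""
    return sum(sum(row) for row in flags)


# Toy binary mask: every entry is exactly 0 or 1.
mask = [
    [0, 0, 1],
    [0, 1, 1],
]

strict = less_than(mask, 0)      # no entry of a 0/1 mask is below 0
inclusive = less_equal(mask, 0)  # selects exactly the zero entries

count_strict = count_true(strict)
count_inclusive = count_true(inclusive)
print(count_strict, count_inclusive)
```

On a strictly binary mask the strict comparison selects nothing, while the inclusive one selects the zero-valued positions, which is the behaviour the question describes.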
Hi! I am having some trouble training RealFill.
With the latest version of transformers, I get: OSError: stabilityai/stable-diffusion-2-inpainting does not appear to have a file named tokenizer/config.json.
I found some suggestions here and reinstalled transformers==4.22.1, but then another error occurred: ImportError: cannot import name 'CLIPTextModelWithProjection' from 'transformers'. Which version of transformers works with RealFill?
Thanks a lot!
Hi. Thank you for your code! I read the original paper, which says:
Figure 9. RealFill is able to generate multiple scene variants when conditioned on a blank image as input, e.g., people are added or removed in the first and second rows. This suggests that the finetuned model can relate the elements inside the scene in a compositional manner.
I am confused about what the blank image means. Could you give an example of this case? Thank you for your reply.
Hello author, your work is really great. While studying it, I found that, with the parameters and test data you provide, the extrapolated regions show obvious discontinuities at the mask boundary. Is there any way to eliminate them?
I look forward to your reply.
Hey, how did you make the mask for the target in the example data?
Hello. Thank you for your code! However, I'm getting the following error when I run it on a machine with 2 NVIDIA RTX A6000s after 500 steps.
Steps: 25%|████████████████████████ | 500/2000 [34:15<1:39:43, 3.99s/it, loss=0.0361]11/09/2023 21:23:13 - INFO - accelerate.accelerator - Saving current state to flowerwoman-model/checkpoint-500
Traceback (most recent call last):
File "train_realfill.py", line 956, in <module>
main(args)
File "train_realfill.py", line 894, in main
accelerator.save_state(save_path)
File "/usr/local/lib/python3.8/dist-packages/accelerate/accelerator.py", line 2793, in save_state
hook(self._models, weights, output_dir)
File "train_realfill.py", line 634, in save_model_hook
sub_dir = "unet" if isinstance(model.base_model.model, type(accelerator.unwrap_model(unet.base_model.model))) else "text_encoder"
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1614, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DistributedDataParallel' object has no attribute 'base_model'
I'm working in a Docker container. Here are the steps I take to reach that error with your default example:
#start a container like so:
#docker run -it --rm --ipc=host --gpus all nvidia/cuda:11.7.1-devel-ubuntu20.04
apt update
apt install git
apt install python3-pip
mkdir GitClones
cd GitClones
git clone https://github.com/thuanz123/realfill
cd realfill
pip install -r requirements.txt
accelerate config default
#Run example https://github.com/thuanz123/realfill#toy-example
I would be grateful for any feedback!
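The AttributeError suggests that, with two GPUs, the model handed to save_model_hook is wrapped in DistributedDataParallel, which keeps the original network under .module. A common pattern is to unwrap the model before touching base_model; accelerate provides accelerator.unwrap_model for this. The sketch below imitates the idea with dummy classes so it runs without torch. FakePeftModel, FakeDDP, and unwrap are hypothetical stand-ins, not the repository's code, and this is not a confirmed fix.

```python
class FakePeftModel:
    """Stand-in for a LoRA-wrapped model that exposes `base_model`."""

    def __init__(self, name):
        self.base_model = name


class FakeDDP:
    """Stand-in for DistributedDataParallel: the real model lives in `.module`."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    """Follow `.module` until the underlying model is reached.

    Real code should prefer accelerator.unwrap_model(model); this loop is
    only an illustration of what unwrapping means for the dummy classes.
    """
    while hasattr(model, "module"):
        model = model.module
    return model


wrapped = FakeDDP(FakePeftModel("unet"))

# Accessing base_model directly on the wrapper fails, as in the traceback.
direct_access_fails = not hasattr(wrapped, "base_model")

# After unwrapping, the attribute is reachable again.
recovered = unwrap(wrapped).base_model
print(direct_access_fails, recovered)
```

In the hook from the traceback, this would correspond to unwrapping model before evaluating model.base_model.model, under the assumption that the wrapper is the only cause of the failure.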