
AOT-GAN for High-Resolution Image Inpainting

[figure: aotgan]

AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting
Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo.

Citation

If any part of our paper and code is helpful to your work, please generously cite and star us 😘 😘 😘 !

@inproceedings{yan2021agg,
  author = {Zeng, Yanhong and Fu, Jianlong and Chao, Hongyang and Guo, Baining},
  title = {Aggregated Contextual Transformations for High-Resolution Image Inpainting},
  booktitle = {Arxiv},
  pages={-},
  year = {2020}
}

Introduction

Despite some promising results, it remains challenging for existing image inpainting approaches to fill in large missing regions in high-resolution images (e.g., 512x512). We find that the difficulty mainly derives from simultaneously inferring missing content and synthesizing fine-grained textures for an extremely large missing region. We propose a GAN-based model that improves performance by:

  1. Enhancing context reasoning with the AOT block in the generator. AOT blocks aggregate contextual transformations with different receptive fields, allowing the model to capture both informative distant contexts and rich patterns of interest for context reasoning (a minimal sketch of such a block follows this list).
  2. Enhancing texture synthesis with SoftGAN in the discriminator. We improve the training of the discriminator with a tailored mask-prediction task. The enhanced discriminator is optimized to distinguish the detailed appearance of real and synthesized patches, which in turn pushes the generator to synthesize more realistic textures.
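
To make the first point concrete, here is a minimal PyTorch sketch of such a split-transform-merge block with a gated residual. It is an illustration only, not the repository's implementation: the dilation rates, the gating scheme, and all names (AOTBlockSketch, branches, fuse, gate) are assumptions.

import torch
import torch.nn as nn

class AOTBlockSketch(nn.Module):
    """Split-transform-merge over dilated convolutions, followed by a gated residual."""

    def __init__(self, dim, rates=(1, 2, 4, 8)):
        super().__init__()
        # One branch per dilation rate; each branch sees a different receptive field.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.ReflectionPad2d(rate),
                nn.Conv2d(dim, dim // len(rates), kernel_size=3, dilation=rate),
                nn.ReLU(inplace=True),
            )
            for rate in rates
        ])
        # Fuse the concatenated contexts back to the input width.
        self.fuse = nn.Sequential(nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3))
        # Spatial gate deciding how much transformed content to keep per location.
        self.gate = nn.Sequential(nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3))

    def forward(self, x):
        # Aggregate contextual transformations with different receptive fields.
        out = torch.cat([branch(x) for branch in self.branches], dim=1)
        out = self.fuse(out)
        gate = torch.sigmoid(self.gate(x))
        # Gated residual: blend the input features with the aggregated context.
        return x * (1.0 - gate) + out * gate

With dim divisible by len(rates), the block preserves both the channel count and the spatial size, so several such blocks can be stacked in the generator's bottleneck.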

Results

[figure: face and object inpainting results]

Prerequisites

  • Python 3.8.8
  • PyTorch (tested on release 1.8.1)

Installation

Clone this repo.

git clone git@github.com:researchmm/AOT-GAN-for-Inpainting.git
cd AOT-GAN-for-Inpainting/

For the full set of required Python packages, we suggest creating a Conda environment from the provided YAML, e.g.

conda env create -f environment.yml
conda activate inpainting

Datasets

  1. Download the images and masks.
  2. Specify the paths to the training data with --dir_image and --dir_mask (see the example command below).
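
For example, using the README's bracket placeholders (these are not paths shipped with the repo):

cd src
python train.py --dir_image [path to training images] --dir_mask [path to mask images]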

Getting Started

  1. Training:
    • Our code is built on distributed training with PyTorch.
    • Run
    cd src
    python train.py
    
  2. Resume training:
    cd src
    python train.py --resume
    
  3. Testing:
    cd src
    python test.py --pre_train [path to pretrained model]
    
  4. Evaluating:
    cd src
    python eval.py --real_dir [ground truths] --fake_dir [inpainting results] --metric mae psnr ssim fid
    

Pretrained models

CELEBA-HQ | Places2

Download the model directories and put them under experiments/

Demo

  1. Download the pre-trained model parameters and put them under experiments/
  2. Run by
cd src
python demo.py --dir_image [folder to images]  --pre_train [path to pre_trained model] --painter [bbox|freeform]
  3. Press '+' or '-' to control the thickness of the painter.
  4. Press 'r' to reset the mask; 'k' to keep existing modifications; 's' to save results.
  5. Press the space bar to perform inpainting; 'n' to move to the next image; 'Esc' to quit the demo.

[figure: face inpainting demo]

TensorBoard

TensorBoard visualization of training is supported.

Run tensorboard --logdir [log_folder] --bind_all and open a browser to view training progress.

LICENSE

This project is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Acknowledgements

We would like to thank the authors of edge-connect and EDSR_PyTorch.


aot-gan-for-inpainting's Issues

Unable to train

I can't train on my own training set. I prepared some 512x512 images for training, but the progress bar stays at 0, like this:

[**] create folder ../experiments/aotgan_places2_pconv512
UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
warnings.warn(
UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
0%| | 0/1000000 [00:00<?, ?it/s]

Picking the right model for a random use case

Hey there,
Loving the paper!

Newbie question:
I'm wondering: if I pick an arbitrary image (without knowing its content), is there a way to automatically match it with the right pre-trained model, or do I have to inspect the image's metadata to pair it with the right model?
Is it possible to merge the models, so that I can feed it a single input and expect it to handle inpainting even without context?

Why does training from scratch on the celeba_256 dataset fail to converge?

Thanks for your contribution! I trained from scratch on celeba_256, but each image is paired with one fixed mask and the input images contain noise; the loss is computed against the clean GT. I set batch_size to 16 and the number of iterations to 1e6. Why does training never converge?
The final log is as follows:
mt:7.5s, dt:0.0s, L1:0.390, Style:0.700, Perceptual:0.389, advg:0.006, advd:0.000, : 100%|█████████▉| 999666/1000000 [139:05:29<04:09, 1.34it/s]

Cannot reproduce results from paper with pre-trained model

Hi !

Thanks for your work, the demo code is neat 🤗
I tried the celebahq pre-trained models on some images and noticed that the model doesn't really "complete" parts of the face when you erase them entirely (such as eyes, eyebrows, or the mouth). I tried the same masks you showed in your paper and could not reach the same quality.

This is the best I could get, for example: am I missing something?
[attached images: test_masked, test_pred]

C++

Very nice work! Does it support C++ inference?

Problems when testing with trained weights

Hello, I trained your model on the public Paris StreetView dataset and kept the resulting weight file, but when I run testing, the synthesized images come out like this:
[attached images: 001_im_comp, 001_im_pred]

Distributed training

Hello, can this model be trained on a single GPU? If so, which parameters should be modified?

Style Loss is the major player for good results and not Adversarial Loss.

Hi! Thanks for the excellent codebase!

I ran a few experiments to measure the importance of the GAN in the current network. It turns out that if we don't use style loss, the results are largely blurry. This makes me wonder about the importance of the GAN. Could you help me understand?

You can confirm the experiments here: https://github.com/praeclarumjj3/AOT-GAN-Experiments#results-using-the-testing-pconv-mask-dataset-without-style-loss

Also, I fixed the bugs present in the adv_loss as mentioned in #2.

Cannot test

I used the following command to execute test.py, but no results appeared.

python src/test.py --dir_image data/pconv/ --pre_train experiments/aotgan_images_pconv512/G0001000.pt

Can you tell me how to handle it? Are there any parameters that I didn't provide?
Thank you very much!

Confused about the my_layer_norm function and the GAN loss function

Hi, thank you for your excellent work. I got a good result when running the demo, but while reading the source code I ran into some questions about the GAN loss function.
In the paper, the discriminator loss for fake_img should be self.loss_fn(d_fake, gauss(1 - mask)), but I see that you just do gauss(mask). Is there something wrong with my understanding?
What's more, the discriminator loss for real_img should be self.loss_fn(d_real, d_real_label), where d_real_label is torch.ones(...), but you write torch.zeros(...).
By the way, could you explain what my_layer_norm does in the AOT block?
Thanks.
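
For readers following this thread: the SoftGAN idea in the paper is to train the discriminator against soft, mask-derived labels rather than hard 0/1 targets. Below is a minimal PyTorch sketch of that idea as the paper describes it and as the questioner reads it (real patches labeled 1, synthesized patches labeled with a Gaussian-blurred copy of 1 - mask). It is not the repository's trainer code; the loss choice (BCE with logits), the blur kernel size, and the resize step are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

bce = nn.BCEWithLogitsLoss()

def soft_discriminator_loss(d_real, d_fake, mask):
    # mask: 1 inside the hole, 0 elsewhere, shape (N, 1, H, W).
    # Blur the inverted mask so the fake-label map fades smoothly across
    # the hole boundary instead of switching hard between 0 and 1.
    soft_target = gaussian_blur(1.0 - mask, kernel_size=[61, 61])
    # Match the spatial resolution of the discriminator's patch predictions.
    soft_target = F.interpolate(soft_target, size=d_fake.shape[-2:], mode="bilinear", align_corners=False)
    loss_real = bce(d_real, torch.ones_like(d_real))  # real patches -> 1
    loss_fake = bce(d_fake, soft_target)               # synthesized patches -> blurred (1 - mask)
    return loss_real + loss_fake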

Own dataset

I get the following error:

images_masked = (images * (1 - masks).float()) + masks
RuntimeError: The size of tensor a (256) must match the size of tensor b (455) at non-singleton dimension 3

The images that I use have a size of 1280x720 (both images and masks), and I use "pconv" as the mask type. Any suggestions?

Cannot train the model

I got the following error:
2024-03-10 10:12:48.171105: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-10 10:12:48.171155: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-10 10:12:48.172506: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-10 10:12:49.612674: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[**] create folder ../experiments/aotgan_places2_pconv256
Traceback (most recent call last):
  File "/content/drive/MyDrive/CODES/AOTGAN/src/train.py", line 51, in <module>
    main_worker(0, 1, args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/train.py", line 30, in main_worker
    trainer = Trainer(args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/trainer/trainer.py", line 23, in __init__
    self.dataloader = create_loader(args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/data/__init__.py", line 14, in create_loader
    data_loader = DataLoader(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 349, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py", line 140, in __init__
    raise ValueError(f"num_samples should be a positive integer value, but got num_samples={self.num_samples}")
ValueError: num_samples should be a positive integer value, but got num_samples=0
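
The ValueError at the end of the trace means the DataLoader was given an empty dataset (num_samples=0), i.e., no training images were found under the configured path. A quick, generic sanity check you can run before training (the directory pattern below is a placeholder, not a path from the repo):

import glob

# Replace the pattern with your actual --dir_image directory and image extension.
files = glob.glob("[path to training images]/*.png")
print(len(files))  # should be greater than 0; otherwise adjust --dir_image or the extension filter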

Do the training masks need to be manually generated?

Hi Yanhong,
I have read your paper on arXiv; excellent work. I noticed in your manuscript that the training masks are paired with the input images, but in the source code (option.py) I saw the pre-defined dir_mask path, which also appears in the README. Do I need to generate the masks myself before training?
