
AOT-GAN for High-Resolution Image Inpainting

[figure: aotgan]

AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting
Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo.

Citation

If any part of our paper and code is helpful to your work, please generously cite and star us 😘 😘 😘 !

@inproceedings{yan2021agg,
  author = {Zeng, Yanhong and Fu, Jianlong and Chao, Hongyang and Guo, Baining},
  title = {Aggregated Contextual Transformations for High-Resolution Image Inpainting},
  booktitle = {Arxiv},
  pages={-},
  year = {2020}
}

Introduction

Despite some promising results, it remains challenging for existing image inpainting approaches to fill in large missing regions in high-resolution images (e.g., 512x512). We find that the difficulty mainly derives from simultaneously inferring missing content and synthesizing fine-grained textures for an extremely large missing region. We propose a GAN-based model that improves performance by:

  1. Enhancing context reasoning with the AOT block in the generator. AOT blocks aggregate contextual transformations with different receptive fields, allowing the model to capture both informative distant contexts and rich patterns of interest for context reasoning (a minimal sketch of such a block follows this list).
  2. Enhancing texture synthesis with SoftGAN in the discriminator. We improve the training of the discriminator with a tailored mask-prediction task. The enhanced discriminator is optimized to distinguish the detailed appearance of real and synthesized patches, which in turn pushes the generator to synthesize more realistic textures.
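
To make the first point concrete, here is a minimal PyTorch sketch of such a split-transform-merge block with a gated residual. It is an illustration only, not the repository's implementation: the dilation rates, the gating scheme, and all names (AOTBlockSketch, branches, fuse, gate) are assumptions.

import torch
import torch.nn as nn

class AOTBlockSketch(nn.Module):
    """Split-transform-merge over dilated convolutions, followed by a gated residual."""

    def __init__(self, dim, rates=(1, 2, 4, 8)):
        super().__init__()
        # One branch per dilation rate; each branch sees a different receptive field.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.ReflectionPad2d(rate),
                nn.Conv2d(dim, dim // len(rates), kernel_size=3, dilation=rate),
                nn.ReLU(inplace=True),
            )
            for rate in rates
        ])
        # Fuse the concatenated contexts back to the input width.
        self.fuse = nn.Sequential(nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3))
        # Spatial gate deciding how much transformed content to keep per location.
        self.gate = nn.Sequential(nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3))

    def forward(self, x):
        # Aggregate contextual transformations with different receptive fields.
        out = torch.cat([branch(x) for branch in self.branches], dim=1)
        out = self.fuse(out)
        gate = torch.sigmoid(self.gate(x))
        # Gated residual: blend the input features with the aggregated context.
        return x * (1.0 - gate) + out * gate

With dim divisible by len(rates), the block preserves both the channel count and the spatial size, so several such blocks can be stacked in the generator's bottleneck.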

Results

[figure: face and object inpainting results]

Prerequisites

  • Python 3.8.8
  • PyTorch (tested on release 1.8.1)

Installation

Clone this repo.

git clone git@github.com:researchmm/AOT-GAN-for-Inpainting.git
cd AOT-GAN-for-Inpainting/

For the full set of required Python packages, we suggest creating a Conda environment from the provided YAML, e.g.

conda env create -f environment.yml
conda activate inpainting

Datasets

  1. Download the images and masks.
  2. Specify the paths to the training data with --dir_image and --dir_mask (see the example command below).
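
For example, using the README's bracket placeholders (these are not paths shipped with the repo):

cd src
python train.py --dir_image [path to training images] --dir_mask [path to mask images]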

Getting Started

  1. Training:
    • Our code is built on distributed training with PyTorch.
    • Run
    cd src
    python train.py
    
  2. Resume training:
    cd src
    python train.py --resume
    
  3. Testing:
    cd src
    python test.py --pre_train [path to pretrained model]
    
  4. Evaluating:
    cd src
    python eval.py --real_dir [ground truths] --fake_dir [inpainting results] --metric mae psnr ssim fid
    

Pretrained models

CELEBA-HQ | Places2

Download the model directories and put them under experiments/

Demo

  1. Download the pre-trained model parameters and put them under experiments/
  2. Run by
cd src
python demo.py --dir_image [folder to images]  --pre_train [path to pre_trained model] --painter [bbox|freeform]
  3. Press '+' or '-' to control the thickness of the painter.
  4. Press 'r' to reset the mask; 'k' to keep existing modifications; 's' to save results.
  5. Press the space bar to perform inpainting; 'n' to move to the next image; 'Esc' to quit the demo.

[figure: face inpainting demo]

TensorBoard

TensorBoard visualization of training is supported.

Run tensorboard --logdir [log_folder] --bind_all and open a browser to view training progress.

LICENSE

This project is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Acknowledgements

We would like to thank the authors of edge-connect and EDSR_PyTorch.


aot-gan-for-inpainting's Issues

Unable to train

I can't train on my own training set. I prepared some 512x512 images for training, but the progress bar stays at 0, like this:

[**] create folder ../experiments/aotgan_places2_pconv512
UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
warnings.warn(
UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=VGG19_Weights.IMAGENET1K_V1. You can also use weights=VGG19_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
0%| | 0/1000000 [00:00<?, ?it/s]

Picking the right model for a random use case

Hey there,
Loving the paper!

Newbie question:
I'm wondering: if I pick an arbitrary image (without knowing its content), is there a way to automatically match it with the right pre-trained model, or do I have to inspect the image's metadata to pair it with the right model?
Is it possible to merge the models, so that I can feed it a single input and expect it to handle inpainting even without context?

Why does training from scratch on the celeba_256 dataset fail to converge?

Thanks for your contribution! I trained from scratch on celeba_256, but each image is paired with one fixed mask and the input images contain noise; the loss is computed against the clean GT. I set batch_size to 16 and the number of iterations to 1e6. Why does training never converge?
The final log is as follows:
mt:7.5s, dt:0.0s, L1:0.390, Style:0.700, Perceptual:0.389, advg:0.006, advd:0.000, : 100%|█████████▉| 999666/1000000 [139:05:29<04:09, 1.34it/s]

Cannot reproduce results from paper with pre-trained model

Hi !

Thanks for your work, the demo code is neat 🤗
I tried the celebahq pre-trained models on some images and noticed that the model doesn't really "complete" parts of the face when you erase them entirely (such as eyes, eyebrows, or the mouth). I tried the same masks you showed in your paper and could not reach the same quality.

This is the best I could get, for example: am I missing something?
[attached images: test_masked, test_pred]

C++

Very nice work! Does it support C++ inference?

Problems when testing with trained weights

Hello, I trained your model on the public Paris StreetView dataset and kept the resulting weight file, but when I run testing, the synthesized images come out like this:
[attached images: 001_im_comp, 001_im_pred]

Distributed training

Hello, can this model be trained on a single GPU? If so, which parameters should be modified?

Style Loss is the major player for good results and not Adversarial Loss.

Hi! Thanks for the excellent codebase!

I ran a few experiments to measure the importance of the GAN in the current network. It turns out that if we don't use style loss, the results are largely blurry. This makes me wonder about the importance of the GAN. Could you help me understand?

You can confirm the experiments here: https://github.com/praeclarumjj3/AOT-GAN-Experiments#results-using-the-testing-pconv-mask-dataset-without-style-loss

Also, I fixed the bugs present in the adv_loss as mentioned in #2.

Cannot test

I used the following command to execute test.py, but no results appeared.

python src/test.py --dir_image data/pconv/ --pre_train experiments/aotgan_images_pconv512/G0001000.pt

Can you tell me how to handle it? Are there any parameters that I didn't provide?
Thank you very much!

Confused about the my_layer_norm function and the GAN loss function

Hi, thank you for your excellent work. I got a good result when running the demo, but while reading the source code I ran into some questions about the GAN loss function.
In the paper, the discriminator loss for fake_img should be self.loss_fn(d_fake, gauss(1 - mask)), but I see that you just do gauss(mask). Is there something wrong with my understanding?
What's more, the discriminator loss for real_img should be self.loss_fn(d_real, d_real_label), where d_real_label is torch.ones(...), but you write torch.zeros(...).
By the way, could you explain what my_layer_norm does in the AOT block?
Thanks.
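
For readers following this thread: the SoftGAN idea in the paper is to train the discriminator against soft, mask-derived labels rather than hard 0/1 targets. Below is a minimal PyTorch sketch of that idea as the paper describes it and as the questioner reads it (real patches labeled 1, synthesized patches labeled with a Gaussian-blurred copy of 1 - mask). It is not the repository's trainer code; the loss choice (BCE with logits), the blur kernel size, and the resize step are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

bce = nn.BCEWithLogitsLoss()

def soft_discriminator_loss(d_real, d_fake, mask):
    # mask: 1 inside the hole, 0 elsewhere, shape (N, 1, H, W).
    # Blur the inverted mask so the fake-label map fades smoothly across
    # the hole boundary instead of switching hard between 0 and 1.
    soft_target = gaussian_blur(1.0 - mask, kernel_size=[61, 61])
    # Match the spatial resolution of the discriminator's patch predictions.
    soft_target = F.interpolate(soft_target, size=d_fake.shape[-2:], mode="bilinear", align_corners=False)
    loss_real = bce(d_real, torch.ones_like(d_real))  # real patches -> 1
    loss_fake = bce(d_fake, soft_target)               # synthesized patches -> blurred (1 - mask)
    return loss_real + loss_fake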

Own dataset

I get the following error:

images_masked = (images * (1 - masks).float()) + masks
RuntimeError: The size of tensor a (256) must match the size of tensor b (455) at non-singleton dimension 3

The images that I use have a size of 1280x720 (both images and masks), and I use "pconv" as the mask type. Any suggestions?

Cannot train the model

I got the following error:
2024-03-10 10:12:48.171105: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-10 10:12:48.171155: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-10 10:12:48.172506: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-10 10:12:49.612674: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[**] create folder ../experiments/aotgan_places2_pconv256
Traceback (most recent call last):
  File "/content/drive/MyDrive/CODES/AOTGAN/src/train.py", line 51, in <module>
    main_worker(0, 1, args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/train.py", line 30, in main_worker
    trainer = Trainer(args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/trainer/trainer.py", line 23, in __init__
    self.dataloader = create_loader(args)
  File "/content/drive/MyDrive/CODES/AOTGAN/src/data/__init__.py", line 14, in create_loader
    data_loader = DataLoader(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 349, in __init__
    sampler = RandomSampler(dataset, generator=generator)  # type: ignore[arg-type]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py", line 140, in __init__
    raise ValueError(f"num_samples should be a positive integer value, but got num_samples={self.num_samples}")
ValueError: num_samples should be a positive integer value, but got num_samples=0
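
The ValueError at the end of the trace means the DataLoader was given an empty dataset (num_samples=0), i.e., no training images were found under the configured path. A quick, generic sanity check you can run before training (the directory pattern below is a placeholder, not a path from the repo):

import glob

# Replace the pattern with your actual --dir_image directory and image extension.
files = glob.glob("[path to training images]/*.png")
print(len(files))  # should be greater than 0; otherwise adjust --dir_image or the extension filter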

Do the training masks need to be manually generated?

Hi Yanhong,
I have read your paper on arXiv; excellent work. I noticed in your manuscript that the training masks are paired with the input images, but in the source code (option.py) I saw the pre-defined dir_mask path, which also appears in the README. Do I need to generate the masks myself before training?
