ustc-jialunpeng / diverse-structure-inpainting
CVPR 2021: "Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE"
License: MIT License
Could you please elaborate a little on the input image size and mask size for the respective datasets? For example, the CelebA input image is 266x266 while the mask is listed as 256x256, which seems to be wrong.
Specifically, could you share the input image size for Places2, and the size of the mask?
I have tested the pre-trained model and found the results to be very off; I am attaching two sample generated images.
Hello, thank you for your amazing implementation
Could you provide a demo file or a Colab notebook for custom images? It would be much appreciated and would help everyone.
Thank you
How to set max_delta and margins when using random masks?
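For reference, here is a minimal sketch of how `max_delta` and `margin` are typically used in DeepFill-style random bounding-box sampling, which this repo's mask code resembles. The function name and the exact convention (margin keeps the box away from the image border; max_delta randomly shrinks the hole) are assumptions, not the repo's actual API:

```python
import random

def random_bbox(img_size=256, mask_size=128, margin=0, max_delta=32):
    """Sketch of random bbox sampling (assumed convention):
    margin keeps the box away from the image border, and
    max_delta randomly shrinks the hole to vary its size."""
    max_top_left = img_size - margin - mask_size
    t = random.randint(margin, max_top_left)  # top
    l = random.randint(margin, max_top_left)  # left
    # shrink the hole by up to max_delta pixels per side
    h = mask_size - random.randint(0, max_delta)
    w = mask_size - random.randint(0, max_delta)
    return t, l, h, w

t, l, h, w = random_bbox()
```

Under this convention, a larger margin restricts where holes can appear, and max_delta=0 makes every hole exactly mask_size pixels.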
Hi,
you mention in the README: "For Places2 and ImageNet, we crop a center 256x256." It seems that in your test.py code you resize all images to 256x256 and don't do any cropping. What did you do for the results in the paper?
Thank you for sharing your great work!
I want to run test.py using your pretrained models. Do I need to prepare the mask images and file lists myself? If you have the code to generate them, would you mind sharing it?
I am trying to train on VGGFace2 faces and I get this error:
TypeError: Input 'filename' of 'ReadFile' Op has type float32 that does not match expected type of string.
Kindly help as soon as possible.
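This error usually means the input pipeline received numeric data where a filename string was expected (the ReadFile op only accepts string tensors). A hedged sketch of validating the file list before feeding it, assuming the flist format is one image path per line (`load_flist` is a hypothetical helper, not a function from the repo):

```python
# Hypothetical sanity check: ReadFile expects string filenames, so every
# line of the flist fed into the pipeline must be an image path string.
def load_flist(path):
    with open(path) as f:
        names = [line.strip() for line in f if line.strip()]
    for n in names:
        if not n.lower().endswith(('.jpg', '.jpeg', '.png')):
            raise ValueError("flist entry is not an image path: %r" % (n,))
    return names
```

If a line such as `0.5` (or raw pixel data) slips into the flist, this raises immediately instead of failing deep inside the TensorFlow graph.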
How are the multiple solutions mentioned in your article reflected in the training and testing process?
Hi and first of all thanks for the great code!
Can I ask why you stack a matrix of ones in the penultimate channel here?
Wouldn't it make more sense to have x = tf.concat([x, ones_x*(1-mask), ones_x*mask], 3)
?
(i.e. you concat the mask as its one-hot encoding).
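To make the question concrete, a small numpy sketch comparing the two concatenations (a constant ones channel versus a one-hot encoding of the mask). The shapes and the convention that the mask is 1 inside the hole are assumptions for illustration:

```python
import numpy as np

# Toy tensors: batch 1, 2x2 spatial, 3 channels; mask is 1 inside the hole.
x = np.zeros((1, 2, 2, 3), dtype=np.float32)
mask = np.array([[[[1.], [0.]], [[0.], [1.]]]], dtype=np.float32)
ones_x = np.ones_like(mask)

# Variant as read in the repo: a constant ones channel plus the mask.
a = np.concatenate([x, ones_x, ones_x * mask], axis=3)
# Suggested variant: one-hot encoding of the mask over two channels.
b = np.concatenate([x, ones_x * (1. - mask), ones_x * mask], axis=3)
```

Both carry the same information: in variant a the penultimate channel is constant, while in variant b the last two channels sum to one at every pixel, which is exactly the one-hot reading of the mask.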
Best,
P.
The paper is exciting! Thanks to the authors. The program runs perfectly on two 2080 GPUs, but the same versions (tensorflow-gpu==1.12.0, CUDA 9.0, cuDNN 7.6.5) cannot be used on two A4000s. Does anyone have a solution to the version problem?
Hi! I am using the model to generate new faces from masked faces. I have run test.py and modified the code a little to remove the face mask.
The model is not performing well on CelebA masked faces. Could you kindly guide me on training for masked faces? Do I need to pass masks?
Hi, thank you for your great work!
When training the VQ-VAE network, training takes a very long time. Could you tell me how to accelerate it? Thanks for replying!
Thanks for your amazing job; the results are amazing. But the inference time is quite long.
On a 32 GB V100 the inference time is 34ms. Is this normal?
While training the structure generator, I reduced batch_size to work around insufficient GPU memory, but saving the training results then fails with: IndexError: index 1 is out of bounds for axis 0 with size 1. I hope I can get your help. Thank you!
Hello, can the model be trained on Windows?
I used this repo last year in my research paper for analysis, and I need to re-run the code. Now, even on a premium Colab GPU, it takes more than 4 minutes to process. I am using the pre-trained celeba-random model downloaded from the repo.
Urgent help needed.
Hello, when I train with your code, an IndexError occurs in this part:
for i in range(4):
    gt_i = ((gt[i] + 1.) * 127.5).astype(np.uint8)
    masked_i = ((masked[i] + 1.) * 127.5).astype(np.uint8)
    complete_i = ((complete[i] + 1.) * 127.5).astype(np.uint8)
    recons_gt_i = ((recons_gt[i] + 1.) * 127.5).astype(np.uint8)
train_structure_generator.py:
Traceback (most recent call last):
File "train_structure_generator.py", line 366, in
nn.structure_visual(gt_np, masked_np, recons_gen_np, recons_gt_np, (i + 1), args.image_size, folder_path)
File "D:\pythonProject\5_19\VQ_VAE\net\nn.py", line 173, in structure_visual
gt_i = ((gt[i] + 1.) * 127.5).astype(np.uint8)
IndexError: index 1 is out of bounds for axis 0 with size 1
How many dimensions does the parameter gt have? What data does gt hold? Thank you.
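From the traceback, `structure_visual` hard-codes `for i in range(4)` while `gt` appears to be a batch of images with shape [batch_size, H, W, C] in [-1, 1]; with batch_size reduced to 1, indexing `gt[1]` fails. A hedged sketch of a guard (the function name is hypothetical, not the repo's):

```python
import numpy as np

def visualize_rows(gt, max_rows=4):
    """Sketch: clamp the visualization loop to the actual batch size so a
    reduced batch_size no longer raises IndexError. gt is assumed to be
    [batch, H, W, C] in [-1, 1], as in the repo's structure_visual."""
    rows = []
    for i in range(min(max_rows, gt.shape[0])):
        # Same denormalization as the repo: [-1, 1] -> [0, 255] uint8.
        rows.append(((gt[i] + 1.) * 127.5).astype(np.uint8))
    return rows
```

With batch_size = 1 this produces a single visualization row instead of crashing.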
Error message:
InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Assign requires shapes of both tensors to match. lhs shape= [3,3,3,64] rhs shape= [4,4,3,64]
[[node save/Assign_958 (defined at test.py:182) = Assign[T=DT_FLOAT, _class=["loc:@vq_encoder/conv2d_0/W"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](vq_encoder/conv2d_0/W, save/RestoreV2/_1917)]]
[[{{node save/RestoreV2/_1374}} = _SendT=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_1380_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
I looked up related material, which says this is caused by a mismatch in the model? I used the pretrained model you provided, the test data is celeba-hq, and everything else followed the steps. Where does the problem lie? Thanks for answering!
I am a newcomer to this field and am very interested in your program. Can you tell me how the filelist is generated? Could you publish that code? Thank you.
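Until the authors share their script, here is a minimal sketch of generating such a file list, assuming the flist format is simply one image path per line (the function name and the one-path-per-line format are assumptions):

```python
import os

def write_flist(image_dir, out_path, exts=('.jpg', '.jpeg', '.png')):
    """Walk image_dir recursively and write one image path per line
    (assumed flist format used by the train/test scripts)."""
    with open(out_path, 'w') as f:
        for root, _, files in os.walk(image_dir):
            for name in sorted(files):
                if name.lower().endswith(exts):
                    f.write(os.path.join(root, name) + '\n')
```

Usage would be e.g. `write_flist('/data/celeba/train', 'train.flist')`, and similarly for validation and test splits.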
Hello,
I am simply running the code as per the described specifications, but it hangs on the run command for a very long time, say 15 to 20 minutes, and I need to interrupt the cell.
I am using the Colab Pro GPU version, which is expensive to purchase again and again. So please help me reconstruct faces as soon as possible.
2023-07-14 11:47:14.743679: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-14 11:47:15.646115: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
File "/content/drive/MyDrive/Diverse-Structure-Inpainting/train_vqvae.py", line 9, in
from net.vqvae import vq_encoder_spec, vq_decoder_spec
File "/content/drive/MyDrive/Diverse-Structure-Inpainting/net/vqvae.py", line 3, in
from tensorflow.contrib.framework.python.ops import arg_scope
ModuleNotFoundError: No module named 'tensorflow.contrib'
How can I resolve this version problem?
I am wondering if there is a discrepancy between how inference is explained in the paper and how it is implemented in test.py. To be specific, the paper says "During inference, only Gs and Gt are used," but the VQ-VAE model is loaded in test.py and used during inference (see here and here for example).
I am new to TensorFlow and not super familiar with VQ-VAE, so I might be missing something, but since the VQ-VAE encoder gets the full image as input (see here), I think there might be an information leak from the full image to the inpainting module at inference time. Please correct me if I'm wrong.
Thank you.
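The concern above can be stated numerically: if the encoder input were masked before encoding, pixels inside the hole could not influence the codes at all. A tiny numpy sketch, assuming the convention that mask is 1 inside the hole (shapes and names are placeholders, not the repo's variables):

```python
import numpy as np

# mask == 1 inside the hole (assumed convention).
x = np.random.rand(1, 4, 4, 3).astype(np.float32)
mask = np.zeros((1, 4, 4, 1), dtype=np.float32)
mask[:, 1:3, 1:3, :] = 1.

# Feeding the full image x lets hole pixels reach the encoder; masking
# first zeroes them out, so no ground-truth information can leak at
# test time.
x_masked = x * (1. - mask)
```

Whether test.py feeds x or x_masked to the VQ-VAE encoder is exactly the point the questioner asks the authors to confirm.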
How do you train the structure generator? I see the code in train_structure_generator.py, where the condition h is totally ignored.
Hi! Thanks for your great work.
I wonder about the effect of the discrete textural features. It seems that these discrete textural features are not used in the image generation part. Is the benefit of splitting structural and textural features better disentanglement?
is this normal?