
co-mod-gan's Introduction

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks, ICLR 2021 (Spotlight)

[NEW!] Another unofficial demo is available!

[NOTICE] Our web demo will be closed soon. Enjoy the last days!

[NEW!] Time to play with our interactive web demo!

Numerous task-specific variants of conditional generative adversarial networks have been developed for image completion. Yet, a serious limitation remains that all existing algorithms tend to fail when handling large-scale missing regions. To overcome this challenge, we propose a generic new approach that bridges the gap between image-conditional and recent modulated unconditional generative architectures via co-modulation of both conditional and stochastic style representations. Also, due to the lack of good quantitative metrics for image completion, we propose the new Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS), which robustly measures the perceptual fidelity of inpainted images compared to real images via linear separability in a feature space. Experiments demonstrate superior performance in terms of both quality and diversity over state-of-the-art methods in free-form image completion and easy generalization to image-to-image translation.

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks
Shengyu Zhao, Jonathan Cui, Yilun Sheng, Yue Dong, Xiao Liang, Eric I Chang, Yan Xu
Tsinghua University and Microsoft Research
arXiv | OpenReview

Overview

This repo is built upon, and has the same dependencies as, the official StyleGAN2 repo. We also provide a Dockerfile for Docker users. This repo currently supports:

  • Large scale image completion experiments on FFHQ and Places2
  • Image-to-image translation experiments on Edges2Shoes and Edges2Handbags
  • Image-to-image translation experiments on COCO-Stuff
  • Evaluation code of Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS)

Datasets

  • FFHQ dataset (in TFRecords format) can be downloaded following the StyleGAN2 repo.
  • Places2 dataset can be downloaded from this website (Places365-Challenge 2016 high-resolution images, training set and validation set). The raw images should be converted into TFRecords using dataset_tools/create_from_images.py with --shuffle --compressed.
  • Edges2Shoes and Edges2Handbags datasets can be downloaded following the pix2pix repo. The raw images should be converted into TFRecords using dataset_tools/create_from_images.py with --shuffle --pix2pix.
  • To prepare a custom dataset, please use dataset_tools/create_from_images.py, which will automatically center-crop and resize your images to the specified resolution. You only need to specify --val-image-dir for testing purposes. A sketch of a typical invocation follows this list.
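
For example, a typical invocation might look like this (the directory names are placeholders; the flags are the ones documented above):

python dataset_tools/create_from_images.py --tfrecord-dir TFRECORD_DIR --train-image-dir TRAIN_IMAGE_DIR --val-image-dir VAL_IMAGE_DIR --resolution 512 --shuffle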

Training

The following script is for training on FFHQ. It will split 10k images for validation. We recommend using 8 NVIDIA Tesla V100 GPUs for training. Training at 512x512 resolution takes about 1 week.

python run_training.py --data-dir=DATA_DIR --dataset=DATASET --metrics=ids10k --mirror-augment --num-gpus=8

The following script is for training on Places2 at 512x512 resolution (the resolution must be specified when training on a compressed dataset); Places2 has a validation set of 36,500 images:

python run_training.py --data-dir=DATA_DIR --dataset=DATASET --resolution=512 --metrics=ids36k5 --total-kimg 50000 --num-gpus=8

The following script is for training on Edges2Handbags (and similarly for Edges2Shoes):

python run_training.py --data-dir=DATA_DIR --dataset=DATASET --metrics=fid200-rt-handbags --mirror-augment --num-gpus=8

Pre-Trained Models

Our pre-trained models are available on Google Drive:

Model name & URL | Description
co-mod-gan-ffhq-9-025000.pkl | Large scale image completion on FFHQ (512x512)
co-mod-gan-ffhq-10-025000.pkl | Large scale image completion on FFHQ (1024x1024)
co-mod-gan-places2-050000.pkl | Large scale image completion on Places2 (512x512)
co-mod-gan-coco-stuff-025000.pkl | Image-to-image translation on COCO-Stuff (labels to photos) (512x512)
co-mod-gan-edges2shoes-025000.pkl | Image-to-image translation on Edges2Shoes (256x256)
co-mod-gan-edges2handbags-025000.pkl | Image-to-image translation on Edges2Handbags (256x256)

Use the following script to run the interactive demo locally:

python run_demo.py -d DATA_DIR/DATASET -c CHECKPOINT_FILE(S)

or the following command as a minimal example of usage:

python run_generator.py -c CHECKPOINT_FILE -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg

Evaluation

The following script is for evaluation:

python run_metrics.py --data-dir=DATA_DIR --dataset=DATASET --network=CHECKPOINT_FILE(S) --metrics=METRIC(S) --num-gpus=1

Commonly used metrics are ids10k and ids36k5 (for FFHQ and Places2, respectively), which compute P-IDS and U-IDS together with FID. By default, masks are generated randomly for evaluation; alternatively, you may append -h0 ([0.0, 0.2]) through -h4 ([0.8, 1.0]) to the metric name to specify the range of the masked ratio, as in the example below.
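
For example, to evaluate on FFHQ using only the largest masks (a sketch of the flag usage described above):

python run_metrics.py --data-dir=DATA_DIR --dataset=DATASET --network=CHECKPOINT_FILE --metrics=ids10k-h4 --num-gpus=1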

Citation

If you find this code helpful, please cite our paper:

@inproceedings{zhao2021comodgan,
  title={Large Scale Image Completion via Co-Modulated Generative Adversarial Networks},
  author={Zhao, Shengyu and Cui, Jonathan and Sheng, Yilun and Dong, Yue and Liang, Xiao and Chang, Eric I and Xu, Yan},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2021}
}

co-mod-gan's People

Contributors

zsyzzsoft


co-mod-gan's Issues

Image Sequence Noise

Hi,
first of all thanks you for this brilliant work!

Is there a way to use this on a sequence of images with some kind of temporal consistency? I'm looking for a way to force the generator to always sample from the same latent noise so that the outputs are always the same, like a predefined seed that makes the results reproducible. Any ideas? Would this also be possible with an already pretrained model?

Many Thanks!
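
A minimal sketch of one way to get reproducible outputs, assuming the loading code and Gs.run signature that run_generator.py uses (the seed value and the frames iterable are hypothetical):

import numpy as np
import dnnlib.tflib as tflib
from training import misc

tflib.init_tf()
_, _, Gs = misc.load_pkl('co-mod-gan-places2-050000.pkl')
# Fix the seed so the same latent (and thus the same style code) is reused for every frame.
latent = np.random.RandomState(42).randn(1, *Gs.input_shape[1:])
for real, mask in frames:  # hypothetical iterable of (3, H, W) images and (1, H, W) masks
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis])[0]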

How to control style truncation?

Hi! Thank you for releasing this! Really amazing results!

I'm trying to evaluate the model on my data, and I'm curious whether there is a way to tune the truncation trick in run_generator.py. I'd like to gain as much quality as possible; I'm not very interested in diversity.

Should I pass something as input to Gs.run?
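
The tracebacks elsewhere on this page show variants of run_generator.py passing a truncation_psi keyword to Gs.run, so that appears to be the knob to tune; a hedged one-liner (0.7 is an arbitrary starting value):

fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis], truncation_psi=0.7)[0]  # lower psi: less diversity, typically higher fidelity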

Generalization to arbitrary image sizes

Very impressive work!
Does this algorithm support image inpainting at arbitrary image sizes?
It seems to throw errors when directly handling image sizes other than 512x512.
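
The released models run at a fixed training resolution, so a common workaround (a sketch, not an official feature of this repo) is to resize the image and mask to 512x512 before inference and resize the completed output back afterwards:

from PIL import Image

img = Image.open('input.jpg').convert('RGB').resize((512, 512), Image.LANCZOS)
mask = Image.open('mask.png').convert('L').resize((512, 512), Image.NEAREST)  # nearest-neighbor keeps the mask binary
# ...then apply the same preprocessing and Gs.run call as run_generator.py, and resize the result back.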

Some bad cases

Hi,
I have used the co-mod-gan-places2-050000.pkl model to restore some food images. In most cases the results are really good, but there are some bad cases. How can I fix them? Retrain the model with some food images, or something else?

Here are some bad cases; the mask is in the middle of the picture (example images attached).

Data Prep

Can you explain how the dataset was prepared for Co-Mod-GAN? For example, pix2pix requires the photos to be concatenated side by side, with the left half being the input and the right half the output. How is the dataset prepared here, and what should I do to apply this to a custom dataset?

Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Failed!

Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Failed!
Traceback (most recent call last):
  File "run_generator.py", line 34, in <module>
    main()
  File "run_generator.py", line 31, in main
    create_from_images(**vars(args))
  File "run_generator.py", line 14, in create_from_images
    _, _, Gs = misc.load_pkl(checkpoint)
  File "D:\Project\Inpainting\co-mod-gan\training\misc.py", line 30, in load_pkl
    return pickle.load(file, encoding='latin1')
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\network.py", line 297, in __setstate__
    self._init_graph()
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\network.py", line 154, in _init_graph
    out_expr = self._build_func(*self.input_templates, **build_kwargs)
  File "<string>", line 391, in G_synthesis_RegionGAN
  File "<string>", line 350, in E_fromrgb
  File "<string>", line 68, in apply_bias_act
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\ops\fused_bias_act.py", line 68, in fused_bias_act
    return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain)
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\ops\fused_bias_act.py", line 122, in _fused_bias_act_cuda
    cuda_kernel = _get_plugin().fused_bias_act
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\ops\fused_bias_act.py", line 16, in _get_plugin
    return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\custom_ops.py", line 112, in get_plugin
    _run_cmd(_prepare_nvcc_cli('"%s" --preprocess -o "%s" --keep --keep-dir "%s"' % (cuda_file, tmp_file, tmp_dir)))
  File "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\custom_ops.py", line 61, in _run_cmd
    raise RuntimeError('NVCC returned an error. See below for full command line and output log:\n\n%s\n\n%s' % (cmd, output))
RuntimeError: NVCC returned an error. See below for full command line and output log:

nvcc "D:\Project\Inpainting\co-mod-gan\dnnlib\tflib\ops\fused_bias_act.cu" --preprocess -o "C:\Users\ADMINI1\AppData\Local\Temp\tmpp_f9al3w\fused_bias_act_tmp.cu" --keep --keep-dir "C:\Users\ADMINI1\AppData\Local\Temp\tmpp_f9al3w" --
disable-warnings --include-path "C:\Program Files\tensorflow\include" --include-path "C:\Program Files\tensorflow\include\external\protobuf_archive\src" --include-path "C:\Program Files\tensorflow\include\external\com_google_absl" --in
clude-path "C:\Program Files\tensorflow\include\external\eigen_archive" --compiler-bindir "C:/Program Files (x86)/Microsoft Visual Studio 14.0/vc/bin" 2>&1

fused_bias_act.cu
D:/Project/Inpainting/co-mod-gan/dnnlib/tflib/ops/fused_bias_act.cu(9): fatal error C1083: Cannot open include file: 'tensorflow/core/framework/op.h': No such file or directory

I installed tensorflow-gpu 1.14 on a Windows PC; please help me with this.

What if the output layer of the generator is activated by "tanh"?

Hi, I notice that the last layer of the generator uses a "linear" activation, i.e., the range of pixel values can stretch across the (-inf, +inf) interval. Is there any advice on, or a reason against, using "tanh" instead of "linear", so that pixel values are limited to the (-1, +1) interval?
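
For reference, the change being asked about would be a one-line swap at the end of the synthesis network; a minimal TF1-style sketch, where x stands for the final linear output (a hypothetical name):

import tensorflow as tf

images_out = tf.tanh(x)  # bounds pixel values to (-1, 1) instead of leaving them unbounded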

How to train with my own datasets?

I want to train on other face datasets. After I convert the format with dataset_tools, how do I train?
What should I pass for --dataset=DATASET?

Why does a download start during training? Training is blocked.

Local submit - run_dir: logs/00003-co-mod-gan-places2_tfrecord-1gpu
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
zzzzzzzzzzzzzzz: int64_list {
value: 3
value: 512
value: 512
}

zzzzzzzzzzzzzzzzz: [3, 512, 512]
Dataset shape = [3, 512, 512]
Dynamic range = [0, 255]
Label size = 0
Constructing networks...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Compiling... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Compiling... Loading... Done.

G Params OutputShape WeightShape


latents_in - (?, 512) -
labels_in - (?, 0) -
images_in - (?, 3, 512, 512) -
masks_in - (?, 1, 512, 512) -
lod - () -
dlatent_avg - (512,) -
G_mapping/latents_in - (?, 512) -
G_mapping/labels_in - (?, 0) -
G_mapping/Normalize - (?, 512) -
G_mapping/Dense0 262656 (?, 512) (512, 512)
G_mapping/Dense1 262656 (?, 512) (512, 512)
G_mapping/Dense2 262656 (?, 512) (512, 512)
G_mapping/Dense3 262656 (?, 512) (512, 512)
G_mapping/Dense4 262656 (?, 512) (512, 512)
G_mapping/Dense5 262656 (?, 512) (512, 512)
G_mapping/Dense6 262656 (?, 512) (512, 512)
G_mapping/Dense7 262656 (?, 512) (512, 512)
G_mapping/Broadcast - (?, 16, 512) -
G_mapping/dlatents_out - (?, 16, 512) -
Truncation/Lerp - (?, 16, 512) -
G_synthesis 23541095 (?, 3, 512, 512) (3, 3, 256, 256)
images_out - (?, 3, 512, 512) -


Total 25642343

D Params OutputShape WeightShape


images_in - (?, 3, 512, 512) -
labels_in - (?, 0) -
masks_in - (?, 1, 512, 512) -
sub - (?, 1, 512, 512) -
512x512/FromRGB 320 (?, 64, 512, 512) (1, 1, 4, 64)
512x512/Conv0 36928 (?, 64, 512, 512) (3, 3, 64, 64)
512x512/Conv1_down 73856 (?, 128, 256, 256) (3, 3, 64, 128)
512x512/Skip 8192 (?, 128, 256, 256) (1, 1, 64, 128)
256x256/Conv0 147584 (?, 128, 256, 256) (3, 3, 128, 128)
256x256/Conv1_down 295168 (?, 256, 128, 128) (3, 3, 128, 256)
256x256/Skip 32768 (?, 256, 128, 128) (1, 1, 128, 256)
128x128/Conv0 590080 (?, 256, 128, 128) (3, 3, 256, 256)
128x128/Conv1_down 1180160 (?, 512, 64, 64) (3, 3, 256, 512)
128x128/Skip 131072 (?, 512, 64, 64) (1, 1, 256, 512)
64x64/Conv0 2359808 (?, 512, 64, 64) (3, 3, 512, 512)
64x64/Conv1_down 2359808 (?, 512, 32, 32) (3, 3, 512, 512)
64x64/Skip 262144 (?, 512, 32, 32) (1, 1, 512, 512)
32x32/Conv0 2359808 (?, 512, 32, 32) (3, 3, 512, 512)
32x32/Conv1_down 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
32x32/Skip 262144 (?, 512, 16, 16) (1, 1, 512, 512)
16x16/Conv0 2359808 (?, 512, 16, 16) (3, 3, 512, 512)
16x16/Conv1_down 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
16x16/Skip 262144 (?, 512, 8, 8) (1, 1, 512, 512)
8x8/Conv0 2359808 (?, 512, 8, 8) (3, 3, 512, 512)
8x8/Conv1_down 2359808 (?, 512, 4, 4) (3, 3, 512, 512)
8x8/Skip 262144 (?, 512, 4, 4) (1, 1, 512, 512)
4x4/MinibatchStddev - (?, 513, 4, 4) -
4x4/Conv 2364416 (?, 512, 4, 4) (3, 3, 513, 512)
4x4/Dense0 4194816 (?, 512) (8192, 512)
Output 513 (?, 1) (512, 1)
scores_out - (?, 1) -


Total 28982913

Building TensorFlow graph...
Initializing logs...
Training for 50000 kimg...

tick 0 kimg 0.1 lod 0.00 minibatch 32 time 1m 23s sec/tick 82.7 sec/kimg 646.15 maintenance 0.0 gpumem 8.2
truncation=None
Downloading https://drive.google.com/uc?id=1MzTY44rLToO5APn8TZmfR7_ENSe5aZUn ....

How to test the sample images?

When I ran the code below:
!python run_generator.py -c /content/drive/MyDrive/co-mod-gan-ffhq-10-025000.pkl -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg
this came as a result, and an exception was caught (screenshots attached).
Any ideas how to solve this?

run_generator.py throws a rank-mismatch error

I was trying to use run_generator.py on the example images in imgs/ and ran into this error.

!python run_generator.py -c /content/drive/MyDrive/co-mod-gan-ffhq-9-025000.pkl -i imgs/example_image.jpg -m imgs/example_mask.jpg -o output.jpg
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Loading... Done.
Traceback (most recent call last):
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,512,512] vs. shape[1] = [1,3,512,512]
	 [[{{node Gs/_Run/Gs/G_synthesis/concat}}]]
	 [[Gs/_Run/Gs/images_out/_1587]]
  (1) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,512,512] vs. shape[1] = [1,3,512,512]
	 [[{{node Gs/_Run/Gs/G_synthesis/concat}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_generator.py", line 35, in <module>
    main()
  File "run_generator.py", line 32, in main
    create_from_images(**vars(args))
  File "run_generator.py", line 18, in create_from_images
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis])[0]
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 442, in run
    mb_out = tf.get_default_session().run(out_expr, dict(zip(in_expr, mb_in)))
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,512,512] vs. shape[1] = [1,3,512,512]
	 [[node Gs/_Run/Gs/G_synthesis/concat (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[Gs/_Run/Gs/images_out/_1587]]
  (1) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,512,512] vs. shape[1] = [1,3,512,512]
	 [[node Gs/_Run/Gs/G_synthesis/concat (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'Gs/_Run/Gs/G_synthesis/concat':
  File "run_generator.py", line 35, in <module>
    main()
  File "run_generator.py", line 32, in main
    create_from_images(**vars(args))
  File "run_generator.py", line 18, in create_from_images
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis])[0]
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 417, in run
    out_gpu = net_gpu.get_output_for(*in_gpu, return_as_list=True, **dynamic_kwargs)
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 221, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 240, in G_main
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 221, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 387, in G_synthesis_RegionGAN
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/array_ops.py", line 1420, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/gen_array_ops.py", line 1257, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
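
The shapes in the error ([1,512,512] vs. [1,3,512,512]) suggest the input image was loaded as a single channel. A hedged workaround, assuming PIL-based loading as in run_generator.py, is to force the expected channel layout before calling Gs.run:

from PIL import Image
import numpy as np

real = np.asarray(Image.open('imgs/example_image.jpg').convert('RGB')).transpose([2, 0, 1])  # (3, H, W)
mask = np.asarray(Image.open('imgs/example_mask.jpg').convert('L'))[np.newaxis]              # (1, H, W)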

dockerfile parse error

Very cool project - thanks for releasing your code! Just a heads up that the readme says:

We also provide a Dockerfile for Docker users.

But the Dockerfile in the repo is an empty text file.

Edit: Actually it's a binary file as mentioned by zsyzzsoft, below.

How to make custom pix2pix dataset and training

Hi,

Thanks for your impressive code.

I've run into a problem with pix2pix when trying to train co-mod-gan on my own dataset. Could you tell me how to make a custom pix2pix dataset that can be used with this program? Thank you very much.

sincerely yours
Bell

Reproducing COCO-stuff

Hi,

thanks for sharing this great code base including pre-trained models! However, I would like to train the COCO-Stuff model from scratch. Can you give me some instructions on what to do? I just want to make sure that I do not miss something.

Thanks,
Simon

Custom data training

Hello,

Is it possible to do custom training on my own dataset (not FFHQ)? If so, could you add a small tutorial section to the README.md so that it would be helpful for others?

Thanks

Use of Normalization Layers in Encoder

Hi,

First of all, thanks for sharing this great work!

My question is about using normalization blocks (batch norm, instance norm, etc.) in the Encoder. I did not find any normalization layer when I investigated the source code. Is this really the case, or am I missing something?

If this is the case, is there any particular reason to omit it?

Getting IndexError while running run_metrics.py with Places2

Hi,

Thanks for sharing this great work!

I have a question about running run_metrics.py with Places2 dataset.

As indicated in the Datasets section, I downloaded the validation set of Places2 and converted it into TFRecords with the --shuffle --compressed flags.

However, when I try to run run_metrics.py (with metrics=ids36k5), I get the following:

dnnlib: Running run_metrics.run() on localhost...
Evaluating metrics "ids36k5" for "models/co-mod-gan-places2-050000.pkl"...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Loading... Done.
truncation=None
Traceback (most recent call last):
  File "run_metrics.py", line 80, in <module>
    main()
  File "run_metrics.py", line 75, in main
    dnnlib.submit_run(sc, 'run_metrics.run', **kwargs)
  File "/workspace/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/workspace/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/workspace/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/workspace/run_metrics.py", line 30, in run
    num_gpus=num_gpus, num_repeats=num_repeats, resume_with_new_nets=resume_with_new_nets, truncations=truncations)
  File "/workspace/metrics/metric_base.py", line 188, in run
    metric.run(*args, **kwargs)
  File "/workspace/metrics/metric_base.py", line 82, in run
    self._evaluate(Gs, Gs_kwargs=Gs_kwargs, num_gpus=num_gpus)
  File "/workspace/metrics/inception_discriminative_score.py", line 35, in _evaluate
    self._configure(self.minibatch_per_gpu, hole_range=self.hole_range)
  File "/workspace/metrics/metric_base.py", line 168, in _configure
    return self._get_dataset_obj().configure(minibatch_size, hole_range=hole_range)
  File "/workspace/metrics/metric_base.py", line 153, in _get_dataset_obj
    self._dataset_obj = dataset.load_dataset(data_dir=self._data_dir, **self._dataset_args)
  File "/workspace/training/dataset.py", line 250, in load_dataset
    dataset = dnnlib.util.get_obj_by_name(class_name)(**kwargs)
  File "/workspace/training/dataset.py", line 87, in __init__
    self.resolution = resolution if resolution is not None else max_shape[1]
IndexError: list index (1) out of range

When I debugged the code, I realized that tfr_shapes=[[74989]] in training/dataset.py line 82. Here is the "features" dictionary without the bytes_list:

{'num_val_images': int64_list {
value: 36500
}
, 'shape': int64_list {
value: 74989
}
, 'compressed': int64_list {
value: 1
}
}

I am using the provided Docker image.

A TypeError

When I tried:
python run_generator.py -c checkpoint/co-mod-gan-ffhq-9-025000.pkl -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg
a TypeError occurred: __init__() got an unexpected keyword argument 'auxiliary_name_scope'
The image resolution is 512x512.

How does the image reconstruction work?

First of all, congrats on your results; your network is state of the art for image inpainting by a margin...
I'm currently trying to implement this in PyTorch, and I am stuck because my TensorFlow 1 knowledge is limited.

As far as I understand it, your implementation is based on StyleGAN2; however, instead of just using a generator with a mapping network to calculate the style vectors, you use a mapping network and an encoder network, where the style vector is calculated by applying a linear layer to the concatenation of the encoder output E(y) and the mapping network output M(z). Here y is the image with holes which we want to reconstruct, and z is noise. The architecture of the encoder is very similar to that of the discriminator in StyleGAN2.

But does this really suffice to reconstruct the image? Each image in the batch is mapped by the encoder into a 512/1024-dimensional vector, and this is the only information the network gets in order to inpaint the image. I feel like I am missing a key piece here...

Thanks for any help in advance!
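
For what it's worth, a minimal PyTorch sketch of the co-modulation step exactly as described above (the class and argument names are hypothetical, not from this repo):

import torch
import torch.nn as nn

class CoModulation(nn.Module):
    """Produce a style vector from the encoder output E(y) and the mapping output M(z)."""
    def __init__(self, dim=512):
        super().__init__()
        self.affine = nn.Linear(2 * dim, dim)  # the linear layer applied to the concatenation

    def forward(self, e_y, m_z):  # e_y: (B, dim), m_z: (B, dim)
        return self.affine(torch.cat([e_y, m_z], dim=1))  # (B, dim) co-modulated style vector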

U-IDS and P-IDS on small dataset

Hi!

I'm wondering, what is the most correct way to evaluate models with U-IDS and P-IDS on small datasets?

The output dimension of InceptionV3 is 2048, which means that for smaller datasets (< 1024 images) the SVM problem is ill-posed, and in the absence of a regularization term each run may lead to a different solution.

Should we apply SVD to the Inception features of all the data before feeding them to the SVM?

Or is the better option to train the SVM multiple times on the same set of predictions and then take the mean value?
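
For concreteness, a sketch of how P-IDS/U-IDS can be computed under the paper's definitions; the repo's metrics/inception_discriminative_score.py is the authoritative version, and this sklearn variant is an assumption:

import numpy as np
from sklearn.svm import LinearSVC

def ids(real_feats, fake_feats):
    """real_feats, fake_feats: paired (N, 2048) arrays of Inception features."""
    X = np.concatenate([real_feats, fake_feats])
    y = np.concatenate([np.ones(len(real_feats)), -np.ones(len(fake_feats))])
    svm = LinearSVC(dual=False).fit(X, y)  # linear separability in feature space
    d_real = svm.decision_function(real_feats)
    d_fake = svm.decision_function(fake_feats)
    u_ids = 0.5 * np.mean(d_real < 0) + 0.5 * np.mean(d_fake > 0)  # unpaired: misclassification rate
    p_ids = np.mean(d_fake > d_real)  # paired: fake scored more "real" than its counterpart
    return p_ids, u_ids

Note that LinearSVC is L2-regularized by default (its C parameter), which is one way the problem stays well-posed on small datasets.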

The training process was killed

Hi,

Thank you for your great work. However, when I use co-mod-gan to train on my own dataset, the process is killed after the first loop. My device contains 1 V100 GPU (28 GB of GPU memory) and 4 CPUs (20 GB of CPU memory each). I set the batch size to 1:
sched.minibatch_size_base = 1
sched.minibatch_gpu_base = 1

(screenshot attached)

And when I used dataset_tools/create_from_images.py with --shuffle to convert the raw images into TFRecords, it stopped while processing the last image, but I still used the result for training. I used the following command to create the data:

python /ProjectRoot/python_workspace/image_inpaint/co-mod-gan/dataset_tools/create_from_images.py --train-image-dir="/GlobalData/ums/image_inpaint/co-mod-gan/train_data" --val-image-dir="/GlobalData/ums/image_inpaint/co-mod-gan/test_data" --tfrecord-dir="/ProjectRoot/python_workspace/image_inpaint/co-mod-gan/tfrecord/food" --shuffle

(screenshot attached)

I wonder why the process was killed. Is something wrong with my dataset, or is my GPU/CPU memory not enough?

Best regards

question about running the run_training.py

Hi, thanks for sharing. When I run run_training.py, it generates an error:
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
Traceback (most recent call last):
  File "run_training.py", line 150, in <module>
    main()
  File "run_training.py", line 145, in main
    run(**vars(args))
  File "run_training.py", line 71, in run
    dnnlib.submit_run(**kwargs)
AssertionError
Please help me, thanks so much.

Prediction on Image and mask without UI

I have a binary mask and an image to be filled, and I want to predict the output; how do I do that using this repository? In deepfill v1/v2 we can predict it directly. Thank you.
Results are pretty awesome on comodgan.ml
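
For reference, the README's minimal example above covers exactly this no-UI case:

python run_generator.py -c CHECKPOINT_FILE -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg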

The COCO-Stuff (labels to photos) checkpoints

Thank you for sharing the pretrained models. In the current README.md, it seems that the COCO-Stuff (labels to photos) checkpoint is linked to the Places2 inpainting checkpoint. Could the README be updated? Thanks!

Question about custom dataset preparation: tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.

Hello, thanks for your great work. I have encountered a problem when trying to train on my own dataset.
I created TFRecords from my own dataset (jpg files only) with the Python scripts as indicated, but when I run the training code, I get this error:

python3 run_training.py --data-dir ./dataset --dataset custom_3 --num-gpus 1 --metrics=ids36k5 --total-kimg 5000
Local submit - run_dir: results/00022-co-mod-gan-custom_3-1gpu
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
tfrecord_dir: dataset/custom_3
max_shape: [3, 256, 256]
Dataset shape = [3, 256, 256]
Dynamic range = [0, 255]
Label size = 0

Building TensorFlow graph...
Initializing logs...
Training for 50000 kimg...

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
  (1) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
     [[GPU0/DataFetch/IteratorGetNext/_2837]]
0 successful operations.
0 derived errors ignored.

Can anyone give a hint?

Current state of Pix2Pix?

Amazing work on this project!

I see quite a few references to Pix2Pix in the code and wonder how much has been implemented so far?

Out of curiosity I wanted to add the ability to load pre-generated masks of objects, such as glasses, to see the impact on the results, but it seems a lot of this logic has been built already.

Issue when running run_generator.py (input depth must be evenly divisible by filter depth: 6 vs 4)

I constantly run into this issue (input depth must be evenly divisible by filter depth: 6 vs 4):

2021-10-13 12:57:27.644988: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at conv_ops.cc:654 : Invalid argument: input depth must be evenly divisible by filter depth: 6 vs 4
Traceback (most recent call last):
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1375, in _do_call
    return fn(*args)
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
    return self._call_tf_sessionrun(options, feed_dict, fetch_list,
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1451, in _call_tf_sessionrun
    return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: input depth must be evenly divisible by filter depth: 6 vs 4
     [[{{node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D}}]]
     [[Gs/_Run/Gs/images_out/_1587]]
  (1) Invalid argument: input depth must be evenly divisible by filter depth: 6 vs 4
     [[{{node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_generator.py", line 61, in <module>
    apply_predefined()
  File "run_generator.py", line 55, in apply_predefined
    create_from_images(pkl_path, img, mask)
  File "run_generator.py", line 31, in create_from_images
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis], truncation_psi=truncation)[0]
  File "/home/sulugodu/ma_gan/MA/Anonymization/co-mod-gan/dnnlib/tflib/network.py", line 445, in run
    mb_out = tf.compat.v1.get_default_session().run(out_expr, dict(zip(in_expr, mb_in)))
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 967, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1190, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1368, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1394, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: input depth must be evenly divisible by filter depth: 6 vs 4
     [[node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D (defined at <string>:60) ]]
     [[Gs/_Run/Gs/images_out/_1587]]
  (1) Invalid argument: input depth must be evenly divisible by filter depth: 6 vs 4
     [[node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D (defined at <string>:60) ]]
0 successful operations.
0 derived errors ignored.

Errors may have originated from an input operation.
Input Source operations connected to node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D:
 Gs/_Run/Gs/G_synthesis/concat (defined at <string>:387)
 Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/mul (defined at <string>:36)

Input Source operations connected to node Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D:
 Gs/_Run/Gs/G_synthesis/concat (defined at <string>:387)
 Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/mul (defined at <string>:36)

Original stack trace for 'Gs/_Run/Gs/G_synthesis/E_512x512/FromRGB/Conv2D':
  File "run_generator.py", line 61, in <module>
    apply_predefined()
  File "run_generator.py", line 55, in apply_predefined
    create_from_images(pkl_path, img, mask)
  File "run_generator.py", line 31, in create_from_images
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis], truncation_psi=truncation)[0]
  File "/home/sulugodu/ma_gan/MA/Anonymization/co-mod-gan/dnnlib/tflib/network.py", line 420, in run
    out_gpu = net_gpu.get_output_for(*in_gpu, return_as_list=True, **dynamic_kwargs)
  File "/home/sulugodu/ma_gan/MA/Anonymization/co-mod-gan/dnnlib/tflib/network.py", line 224, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 239, in G_main
  File "/home/sulugodu/ma_gan/MA/Anonymization/co-mod-gan/dnnlib/tflib/network.py", line 224, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 391, in G_synthesis_RegionGAN
  File "<string>", line 350, in E_fromrgb
  File "<string>", line 60, in conv2d_layer
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 2281, in conv2d_v2
    return conv2d(input,  # pylint: disable=redefined-builtin
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/ops/nn_ops.py", line 2388, in conv2d
    return gen_nn_ops.conv2d(
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 969, in conv2d
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3561, in _create_op_internal
    ret = Operation(
  File "/home/sulugodu/.virtualenvs/ma_gan/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 2045, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)
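
The 6-vs-4 mismatch suggests the mask was loaded with three channels: the FromRGB convolution expects 3 image channels plus a 1-channel mask (4 total) but received 3 + 3 (6). A hedged fix is to force the mask to a single channel before the Gs.run call:

from PIL import Image
import numpy as np

mask = np.asarray(Image.open('mask.png').convert('L'))[np.newaxis]  # (1, H, W), single channel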

training failed

I can't figure out why this is happening: when I try to fine-tune on a different dataset, training takes 2 to 3 minutes, then it stops after generating a pkl file, and the network doesn't seem to be trained. (Screenshot attached.)

About quantitative metrics?

Hi, I have some doubts about the calculation of the numerical metrics.

  1. Your model (co-mod) is trained on 512x512 images, and DeepFillv2 (official) is trained on 256x256 images. Given this resolution mismatch, how do you calculate the metrics (like FID)?
    a. Does DeepFillv2 (official) receive a 256x256 masked image, and you resize the model's output image to 512x512?
    b. Does DeepFillv2 (official) receive a 512x512 masked image?
    c. Some other method?
  2. What is the input resolution of the DeepFillv2 (retrained) model, 256x256 or 512x512?
  3. Figure 18 in your paper displays some 1024x1024 images. Do you run inference on 1024x1024 images directly with the co-mod model trained at 512x512?

The training process always gets killed

Hi,

Thank you for your great work, it's amazing. However, when I use co-mod-gan to train on the FFHQ dataset myself, the process always gets killed. My device contains 4 1080 Ti GPUs, each with 12 GB of GPU memory, and 32 GB of RAM.

When I use the 512x512 FFHQ TFRecord dataset to train the model, it shows "killed". Could you tell me how much memory you use, and what I should do? Thank you so much.

Best regards

Creating custom datasets throws ValueError: axes don't match array

I was trying to create a custom dataset with dataset_tools/create_from_images.py, but even after ensuring all images are of the same size (512, 512, 3), it throws this error:

Processes created.
WARNING:tensorflow:From /content/co-mod-gan/dataset_tools/tfrecord_utils.py:36: The name tf.python_io.TFRecordOptions is deprecated. Please use tf.io.TFRecordOptions instead.

WARNING:tensorflow:From /content/co-mod-gan/dataset_tools/tfrecord_utils.py:36: The name tf.python_io.TFRecordCompressionType is deprecated. Please use tf.compat.v1.python_io.TFRecordCompressionType instead.

WARNING:tensorflow:From /content/co-mod-gan/dataset_tools/tfrecord_utils.py:38: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

Processing training images...
./data/training
100% 5541/5541 [00:00<00:00, 76086.06it/s]
Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Process Process-4:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
  0% 0/5541 [00:00<?, ?it/s]Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Process Process-5:
Process Process-7:
Process Process-8:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Process Process-6:
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "dataset_tools/create_from_images.py", line 30, in worker
    img = np.asarray(img).transpose([2, 0, 1])
ValueError: axes don't match array
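
transpose([2, 0, 1]) assumes an (H, W, C) array, so it fails on grayscale (H, W) or palette images even when their nominal size is 512x512. A hedged fix, assuming img is the PIL image inside the create_from_images.py worker, is to normalize the mode first:

img = np.asarray(img.convert('RGB')).transpose([2, 0, 1])  # force (H, W, 3) before moving channels first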

Error when running in Colab

Does anyone know how to fix this?

!gdown -q --id 1M2dSxlJnCFNM6LblpB2nQCnaimgwaaKu
!git clone https://github.com/zsyzzsoft/co-mod-gan.git &> /dev/null
%tensorflow_version 1.x
%cd co-mod-gan

!python run_generator.py -c /content/co-mod-gan-ffhq-10-025000.pkl -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Compiling... Loading... Done.
Setting up TensorFlow plugin "upfirdn_2d.cu": Preprocessing... Compiling... Loading... Done.
Traceback (most recent call last):
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Input to reshape is a tensor with 8405000 values, but the requested shape requires a multiple of 33620000
	 [[{{node Gs/_Run/Gs/G_synthesis/E_1024x1024/Conv1_down/Reshape_1}}]]
	 [[Gs/_Run/Gs/images_out/_1767]]
  (1) Invalid argument: Input to reshape is a tensor with 8405000 values, but the requested shape requires a multiple of 33620000
	 [[{{node Gs/_Run/Gs/G_synthesis/E_1024x1024/Conv1_down/Reshape_1}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_generator.py", line 34, in <module>
    main()
  File "run_generator.py", line 31, in main
    generate(**vars(args))
  File "run_generator.py", line 16, in generate
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis], truncation_psi=truncation)[0]
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 442, in run
    mb_out = tf.get_default_session().run(out_expr, dict(zip(in_expr, mb_in)))
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Input to reshape is a tensor with 8405000 values, but the requested shape requires a multiple of 33620000
	 [[node Gs/_Run/Gs/G_synthesis/E_1024x1024/Conv1_down/Reshape_1 (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
	 [[Gs/_Run/Gs/images_out/_1767]]
  (1) Invalid argument: Input to reshape is a tensor with 8405000 values, but the requested shape requires a multiple of 33620000
	 [[node Gs/_Run/Gs/G_synthesis/E_1024x1024/Conv1_down/Reshape_1 (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'Gs/_Run/Gs/G_synthesis/E_1024x1024/Conv1_down/Reshape_1':
  File "run_generator.py", line 34, in <module>
    main()
  File "run_generator.py", line 31, in main
    generate(**vars(args))
  File "run_generator.py", line 16, in generate
    fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis], truncation_psi=truncation)[0]
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 417, in run
    out_gpu = net_gpu.get_output_for(*in_gpu, return_as_list=True, **dynamic_kwargs)
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 221, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 241, in G_main
  File "/content/co-mod-gan/dnnlib/tflib/network.py", line 221, in get_output_for
    out_expr = self._build_func(*final_inputs, **build_kwargs)
  File "<string>", line 384, in G_synthesis_RegionGAN
  File "<string>", line 355, in E_block
  File "<string>", line 58, in conv2d_layer
  File "/content/co-mod-gan/dnnlib/tflib/ops/upfirdn_2d.py", line 333, in conv_downsample_2d
    x = _simple_upfirdn_2d(x, k, pad0=(p+1)//2, pad1=p//2, data_format=data_format, impl=impl)
  File "/content/co-mod-gan/dnnlib/tflib/ops/upfirdn_2d.py", line 363, in _simple_upfirdn_2d
    y = tf.reshape(y, [-1, _shape(x, 1), _shape(y, 1), _shape(y, 2)])
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/array_ops.py", line 131, in reshape
    result = gen_array_ops.reshape(tensor, shape, name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/gen_array_ops.py", line 8115, in reshape
    "Reshape", tensor=tensor, shape=shape, name=name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
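
The ratio of the two reshape sizes is exactly 4, which matches feeding 512x512 example images into the 1024x1024 network (co-mod-gan-ffhq-10). A hedged fix is to use the 512x512 FFHQ checkpoint with the bundled example images:

python run_generator.py -c /content/co-mod-gan-ffhq-9-025000.pkl -i imgs/example_image.jpg -m imgs/example_mask.jpg -o imgs/example_output.jpg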

Issue when running create_from_images.py

Hi all, I attempted to train on a custom dataset and ran create_from_images.py.

I hit an error when running create_from_images.py; here is the code:

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--C:/Users/xxxx/PycharmProjects/co-mod-gan/generated_TFRecord', help='Output directory of generated TFRecord', required=True)
    parser.add_argument('--D:/xxxx/co-mod-gan/validation_set', help='Root directory of validation images', default=None)
    parser.add_argument('--D:/xxxx/co-mod-gan/training_set', help='Root directory of training images', default=None)
    parser.add_argument('--resolution', help='Target resolution', type=int, default=512)
    parser.add_argument('--num-channels', help='Number of channels of images', type=int, default=3)
    parser.add_argument('--num-processes', help='Number of parallel processes', type=int, default=8)
    parser.add_argument('--shuffle', default=False, action='store_true')
    parser.add_argument('--compressed', default=False, action='store_true')
    parser.add_argument('--pix2pix', default=False, action='store_true')

    args = parser.parse_args()
    create_from_images(**vars(args))


if __name__ == "__main__":
    main()

It gives this error:

usage: create_from_images.py [-h]
                             --C:\Users\xxxx\PycharmProjects\co-mod-gan\generated_TFRecord
                             C:\USERS\xxxx\PYCHARMPROJECTS\CO_MOD_GAN\GENERATED_TFRECORD
                             [--D:\xxxx\co-mod-gan\validation_set D:\FYP\CO_MOD_GAN\VALIDATION_SET]
                             [--D:\xxxx\co-mod-gan\training_set D:\FYP\CO_MOD_GAN\TRAINING_SET]
                             [--resolution RESOLUTION]
                             [--num-channels NUM_CHANNELS]
                             [--num-processes NUM_PROCESSES] [--shuffle]
                             [--compressed] [--pix2pix]
create_from_images.py: error: the following arguments are required: --C:\Users\xxxx\PycharmProjects\co-mod-gan\generated_TFRecord

And if the leading -- is deleted,

    parser.add_argument('C:/Users/xxxx/PycharmProjects/co-mod-gan/generated_TFRecord', help='Output directory of generated TFRecord', required=True)
    parser.add_argument('D:/xxxx/co-mod-gan/validation_set', help='Root directory of validation images', default=None)
    parser.add_argument('D:/xxxx/co-mod-gan/training_set', help='Root directory of training images', default=None)

Another error occurs:

Traceback (most recent call last):
  File "C:/Users/xxxx/PycharmProjects/co-mod-gan/dataset_tools/create_from_images.py", line 101, in <module>
    main()
  File "C:/Users/xxxx/PycharmProjects/co-mod-gan/dataset_tools/create_from_images.py", line 86, in main
    parser.add_argument('C:/Users/keker/PycharmProjects/co-mod-gan/generated_TFRecord', help='Output directory of generated TFRecord', required=True)
  File "C:\ProgramData\Anaconda3\envs\co-mod-gan\lib\argparse.py", line 1320, in add_argument
    kwargs = self._get_positional_kwargs(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\co-mod-gan\lib\argparse.py", line 1432, in _get_positional_kwargs
    raise TypeError(msg)
TypeError: 'required' is an invalid argument for positionals

Would you please help me point out what I am doing wrong here? Thank you very much.

I'm new to programming and really appreciate any help :)
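
The first argument to add_argument should stay the option name; paths are passed as values on the command line instead. A minimal corrected invocation under that reading (the directories are the asker's own):

python dataset_tools/create_from_images.py --tfrecord-dir C:/Users/xxxx/PycharmProjects/co-mod-gan/generated_TFRecord --train-image-dir D:/xxxx/co-mod-gan/training_set --val-image-dir D:/xxxx/co-mod-gan/validation_set --resolution 512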

Why did you use TensorFlow v1 for the implementation ?

Hello,

Great paper and great results!

I would really like to try out your model, but out of curiosity, I would like to know why you chose to implement your model using TensorFlow v1. As mentioned in other issues, it makes running your models and loading the pre-trained weights a real pain.

I understand that older models from 2017, 2018, and 2019 relied on TF v1 because it was still the main version back then, but TF v1's development has now stopped; it doesn't support newer versions of CUDA or newer GPUs, and many libraries now rely on TensorFlow v2.

Sorry for posting this as an issue. I just hope you can provide a public answer for all future researchers trying to replicate your architecture.

Errors in data preprocessing and training

Hi,

Thank you for making your code public. I used your code and instructions to train on my data but ran into the following issues. (Note that I was previously able to generate TFRecords using the original StyleGAN2's data processing code and train using StyleGAN2's training code without modifying anything, so I am not sure why I cannot get your code to work with my data, since you use the StyleGAN2 codebase.)

  1. I first tried to generate TFRecords using the following command line:
    python dataset_tools/create_from_images.py --tfrecord-dir <tfrec_dir> --train-image-dir <img_dir> --resolution 1024 --shuffle True --compressed True
    The images are loaded and the number of images is reported correctly, and the TFRecord dir is created, but the TFRecords are not being saved (only one TFRecord file with 0 bytes is written).

  2. So I generated TFRecords using the original StyleGAN2's data processing code and tried to train using the following command line:
    python run_training.py --data-dir=<tfrec_dir> --result-dir=<result_dir> --dataset="" --num-gpus=4 --total-kimg=10000 --mirror-augment=True
    But it runs into an OOM error even though I reduced the batch size from 32 to 16.
    The error is: unable to allocate memory for a [1,128,1024,1024] tensor.

I was able to train using StyleGAN2's code with batch size 32 at 1024x1024 resolution, so I would appreciate your help.

thanks

[Share]How to run demo and training in a docker container

After a miserable experience getting training to run with this repo, I thought I should write it down to share with others. My situation: I failed to run the model on my own Ubuntu server because I hadn't installed CUDA 10.0. But running in Docker is also not so easy.

Run the demo

  • Build the docker
docker build -t new-comod-gan .
docker run -itd -v /your_work_dir:/work  -v /your_data_dir:/data --name comod -p 7200-7220:7200-7220 --gpus all new-comod-gan /bin/bash
  • Install extra packages in your container
pip install opencv-python 
pip install tqdm
pip install scikit-learn
  • Prepare your test image dataset
python dataset_tools/create_from_images.py --val-image-dir ./your_test_images_dir --tfrecord-dir ./tfrecords
  • Add cuda to path in your container, you'd better add this to .bashrc
export PATH="/usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH"
  • Now you can run the demo with pretrained model
python run_demo.py -c ./pretrained/co-mod-gan-places2-050000.pkl -d ./tfrecords
  • But... you didn't get the GUI from your remote container? VcXsrv may help if you are using Windows locally. My solution is:
    • Install VcXsrv on Windows.
    • Use VS Code to access the container, and install the Remote X11 plugin in VS Code.
    • In VS Code settings -> Remote, search for the Host option, and set the remote host IP to your remote server's IP (not the container's IP).
    • Start VcXsrv first on your local Windows machine, and then run your remote demo. The GUI will be displayed if there is no error.

Run training

  • Prepare your own training dataset.
    Here, I prepared some images in ./imgs/png_samples/ for a training test.
python dataset_tools/create_from_images.py --train-image-dir ./imgs/png_samples/ --val-image-dir ./imgs/png_samples/ --tfrecord-dir ./train_dataset --resolution 512 --num-channels 3

Note:
1. Only 3 channels can be used. If you're using png files, do not set --num-channels to 4, or you'll get an error in training.
2. --val-image-dir must be specified, or you'll get an error in training.

  • Run your training
python run_training.py --data-dir=./  --dataset=train_dataset --metrics=ids10k --mirror-augment True --num-gpus=4

For researchers in China: you may need a VPN, since the training process downloads an Inception model file. You can:

export https_proxy="https://your_vpn_ip:port"

I have a problem running run_generator.py

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: slice index 0 of dimension 1 out of bounds.
[[{{node Gs/_Run/Gs/G_synthesis/strided_slice_1}}]]
[[Gs/_Run/Gs/images_out/_1419]]
(1) Invalid argument: slice index 0 of dimension 1 out of bounds.
[[{{node Gs/_Run/Gs/G_synthesis/strided_slice_1}}]]
0 successful operations.
0 derived errors ignored.

Thank you for your code, but it reports an error when I run it. How can I solve this? Thanks.

Parameter tuning and re-implementation with Pytorch

First, thank you for the impressive work! Currently, I am re-implementing a pytorch version of co-mod-gan, and I have several questions regarding the model:

  1. Have you tried different R1 regularization weights? Empirically, I found that with an R1 weight smaller than 10, the L1 loss converges faster; have you tried other R1 weights? (A hedged sketch of this penalty follows the list.)
  2. Does dropout on the global code improve performance?
  3. Have you tried adding a skip connection to the encoder?
  4. Also, why is the style mixing probability set to 0.5?
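
For question 1, a hedged PyTorch sketch of the R1 penalty under discussion; D (the discriminator), real (an NCHW batch of real images), and the default gamma = 10 are assumptions following the StyleGAN2 convention, not this repo's code:

# Hedged PyTorch sketch of the R1 gradient penalty from question 1.
# `D` and `real` are assumed from the surrounding training loop.
import torch

def r1_penalty(D, real, gamma=10.0):
    real = real.detach().requires_grad_(True)
    d_out = D(real)
    grad, = torch.autograd.grad(d_out.sum(), real, create_graph=True)
    return (gamma / 2) * grad.pow(2).sum(dim=[1, 2, 3]).mean()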

Thanks

Getting an error when running run_training.py with a custom dataset

Hello, thanks for your wonderful work!

I have a question about running run_training.py with a custom dataset.

  1. The dataset was pre-processed with create_from_images.py, which produced a single TFRecord file. Is a single file OK? (2.07 GB containing 2k images)
  2. The following error was given when running python run_training.py --data-dir=<> --result-dir=<> --dataset="train" --num-gpus=1 --total-kimg=10000 --mirror-augment=True
Local submit - run_dir: /content/drive/MyDrive/co-mod-gan/results/00006-co-mod-gan-train_all-1gpu
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
tcmalloc: large alloc 4294967296 bytes == 0x562d81b88000 @  0x7f9c7abf2001 0x7f9c776d654f 0x7f9c77726b58 0x7f9c7772ab17 0x7f9c777c9203 0x562d79b9c424 0x562d79b9c120 0x562d79c10b80 0x562d79c0b66e 0x562d79b9e36c 0x562d79bdf7b9 0x562d79bdc6d4 0x562d79b9e571 0x562d79c0d633 0x562d79c0b02f 0x562d79adce2b 0x562d79c0d633 0x562d79c0b66e 0x562d79adce2b 0x562d79c0d633 0x562d79b9d9da 0x562d79c0beae 0x562d79b9d9da 0x562d79c0c108 0x562d79c0b02f 0x562d79adce2b 0x562d79c0d633 0x562d79c0b02f 0x562d79adce2b 0x562d79c0d633 0x562d79b9d9da
tcmalloc: large alloc 4294967296 bytes == 0x562e81b88000 @  0x7f9c7abf01e7 0x7f9c776d646e 0x7f9c77726c7b 0x7f9c7772735f 0x7f9c777c9103 0x562d79b9c424 0x562d79b9c120 0x562d79c10b80 0x562d79c0b02f 0x562d79b9daba 0x562d79c0ccd4 0x562d79c0b02f 0x562d79b9daba 0x562d79c0ccd4 0x562d79c0b02f 0x562d79b9daba 0x562d79c0ccd4 0x562d79b9d9da 0x562d79c0beae 0x562d79c0b02f 0x562d79b9daba 0x562d79c102c0 0x562d79c0b02f 0x562d79b9daba 0x562d79c0ccd4 0x562d79c0b66e 0x562d79b9e36c 0x562d79bdf7b9 0x562d79bdc6d4 0x562d79b9e571 0x562d79c0d633
tcmalloc: large alloc 4294967296 bytes == 0x562f834ea000 @  0x7f9c7abf01e7 0x7f9c776d646e 0x7f9c77726c7b 0x7f9c7772735f 0x7f9c22441235 0x7f9c21dc4792 0x7f9c21dc4d42 0x7f9c21d7daee 0x562d79b9c317 0x562d79b9c120 0x562d79c10679 0x562d79b9d9da 0x562d79c0c108 0x562d79c0b1c0 0x562d79adceb0 0x562d79c0d633 0x562d79c0b02f 0x562d79b9daba 0x562d79c0c108 0x562d79c0b66e 0x562d79b9daba 0x562d79c0c108 0x562d79b9d9da 0x562d79c0c108 0x562d79c0b02f 0x562d79b9e151 0x562d79b9e571 0x562d79c0d633 0x562d79c0b02f 0x562d79b9daba 0x562d79c0beae
Dataset shape = [3, 512, 512]
Dynamic range = [0, 255]
Label size    = 0
Traceback (most recent call last):
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)


 tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[{{node Dataset_1/IteratorGetNext}}]]



During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_training.py", line 133, in <module>
    main()
  File "run_training.py", line 128, in main
    run(**vars(args))
  File "run_training.py", line 71, in run
    dnnlib.submit_run(**kwargs)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/content/drive/MyDrive/co-mod-gan/training/training_loop.py", line 142, in training_loop
    grid_size, grid_reals, grid_labels, grid_masks = misc.setup_snapshot_image_grid(training_set, **grid_args)
  File "/content/drive/MyDrive/co-mod-gan/training/misc.py", line 123, in setup_snapshot_image_grid
    reals[:], labels[:] = training_set.get_minibatch_val_np(gw * gh)
  File "/content/drive/MyDrive/co-mod-gan/training/dataset.py", line 189, in get_minibatch_val_np
    return tflib.run(self._tf_minibatch_val_np)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/tflib/tfutil.py", line 31, in run
    return tf.get_default_session().run(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)


tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
	 [[node Dataset_1/IteratorGetNext (defined at /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py:1748) ]]


Original stack trace for 'Dataset_1/IteratorGetNext':
  File "run_training.py", line 133, in <module>
    main()
  File "run_training.py", line 128, in main
    run(**vars(args))
  File "run_training.py", line 71, in run
    dnnlib.submit_run(**kwargs)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/submit.py", line 343, in submit_run
    return farm.submit(submit_config, host_run_dir)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/internal/local.py", line 22, in submit
    return run_wrapper(submit_config)
  File "/content/drive/MyDrive/co-mod-gan/dnnlib/submission/submit.py", line 280, in run_wrapper
    run_func_obj(**submit_config.run_func_kwargs)
  File "/content/drive/MyDrive/co-mod-gan/training/training_loop.py", line 142, in training_loop
    grid_size, grid_reals, grid_labels, grid_masks = misc.setup_snapshot_image_grid(training_set, **grid_args)
  File "/content/drive/MyDrive/co-mod-gan/training/misc.py", line 123, in setup_snapshot_image_grid
    reals[:], labels[:] = training_set.get_minibatch_val_np(gw * gh)
  File "/content/drive/MyDrive/co-mod-gan/training/dataset.py", line 188, in get_minibatch_val_np
    self._tf_minibatch_val_np = self.get_minibatch_val_tf()
  File "/content/drive/MyDrive/co-mod-gan/training/dataset.py", line 174, in get_minibatch_val_tf
    return self._tf_val_iterator.get_next()
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/data/ops/iterator_ops.py", line 426, in get_next
    name=name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/ops/gen_dataset_ops.py", line 2518, in iterator_get_next
    output_shapes=output_shapes, name=name)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Is it related to the TensorFlow version? I am trying to run the training on Google Colab, which only provides TensorFlow 1.15.2 now...

Could you please help me figure out what I did wrong?

Thanks for any of your help and happy CNY :)
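
For reference, the OutOfRangeError above fires in setup_snapshot_image_grid when the validation iterator runs dry, so one hedged check is to count how many records the validation TFRecord actually holds (the path below is a placeholder; TF 1.x API assumed):

# Hedged diagnostic: count records in the validation TFRecord (TF 1.x API).
# The snapshot grid requests gw * gh validation images at startup, so an
# "End of sequence" error usually means the file holds fewer than that.
import tensorflow as tf

path = './train_dataset/val.tfrecords'  # placeholder: your validation TFRecord
count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
print('validation records:', count)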

The time needed to test an image

This is really great work! However, when I run the run_generator.py script, it takes more than 20 s to test an image. Is that normal?
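
One hedged explanation: the first Gs.run() call typically includes TensorFlow graph construction, so timing a second call separates build time from steady-state inference. Gs, latent, real, and mask are assumed to be set up as in run_generator.py:

# Hedged timing sketch: separate the warm-up call (graph build) from a
# steady-state call. `Gs`, `latent`, `real`, `mask` are assumed as in
# run_generator.py; truncation_psi=1.0 is an arbitrary choice here.
import time

t0 = time.time()
_ = Gs.run(latent, None, real, mask, truncation_psi=1.0)   # warm-up
print('first call:  %.1f s' % (time.time() - t0))

t0 = time.time()
_ = Gs.run(latent, None, real, mask, truncation_psi=1.0)   # steady state
print('second call: %.1f s' % (time.time() - t0))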

Issue with run_generator.py

I trained on my own dataset using dataset_tools/create_from_images.py, and when running run_generator.py I get the following errors.

When using fake = Gs.run(latent, None, real, mask, truncation_psi=truncation)[0]:
Error:
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: input must be 4-dimensional [1,512,256]
[[{{node Gs/_Run/Gs/G_synthesis/E_256x256/FromRGB/Conv2D}}]]
[[Gs/_Run/Gs/images_out/_1407]]
(1) Invalid argument: input must be 4-dimensional [1,512,256]
[[{{node Gs/_Run/Gs/G_synthesis/E_256x256/FromRGB/Conv2D}}]]
0 successful operations.
0 derived errors ignored.

When using fake = Gs.run(latent, None, real[np.newaxis], mask, truncation_psi=truncation)[0]:
Error:
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,256,256] vs. shape[1] = [1,3,256,256]
[[{{node Gs/_Run/Gs/G_synthesis/concat}}]]
[[Gs/_Run/Gs/images_out/_1407]]
(1) Invalid argument: ConcatOp : Ranks of all input tensors should match: shape[0] = [1,256,256] vs. shape[1] = [1,3,256,256]
[[{{node Gs/_Run/Gs/G_synthesis/concat}}]]
0 successful operations.
0 derived errors ignored.

Details:
Real image shape: [3,256,256]
Mask shape: [1,256,256]
Dynamic range adjustment applied to the real image.
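
A hedged reading of the two errors: both inputs are missing a leading batch dimension. The ConcatOp message shows the mask arriving as [1,256,256] where a rank-4 tensor is expected, so adding np.newaxis to both real and mask may resolve it:

# Hedged fix: give both the real image and the mask a leading batch dim,
# i.e. real -> [1, 3, 256, 256] and mask -> [1, 1, 256, 256].
import numpy as np

fake = Gs.run(latent, None, real[np.newaxis], mask[np.newaxis],
              truncation_psi=truncation)[0]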
