hubert0527 / coco-gan
COCO-GAN: Generation by Parts via Conditional Coordinating (ICCV 2019 oral)
Home Page: https://hubert0527.github.io/COCO-GAN/
License: MIT License
How long did you train your models and on what hardware?
I am training on the CelebA dataset, size 64x64 and config N2M2S32. With batch size 64, it takes a bit less than 24h for 50 epochs on a TITAN Xp.
I'm having OOM issues trying to run a custom dataset at 256x256 resolution. Apart from creating a YAML config file for my dataset, does anything else need to be implemented for this to work?
Hello, would it be possible to provide a sample script, or some guidance, on how to sample from one of the pre-trained models?
Hi,
I am also trying to test your model on my custom dataset.
One question about your testing procedure:
Do you split the CelebA (or CelebA 64x64) dataset into training and testing parts, and precalculate the fid-stat on the testing part?
Or do you precalculate the fid-stat on the whole dataset, train the model on the whole dataset, and compare the statistics of the fake samples with the precalculated fid-stat?
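For context on the question above, this is the standard definition of the Fréchet Inception Distance. The symbols are notation chosen here, not taken from the repository: (μ_r, Σ_r) are the mean and covariance of Inception features over the reference images (the precalculated fid-stat), and (μ_g, Σ_g) are the same statistics over generated samples. The two procedures asked about differ only in which images μ_r and Σ_r are computed from.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```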
For training a 256x256 model, what is the size of the network in terms of number of trainable parameters?
What is the easiest way to generate X example images from a trained model?
Hi @hubert0527, I changed the panorama training set and left all other settings unchanged. The loss becomes very large, as shown in the figure. Do you have any idea why? Also, do you plan to provide a pre-trained model for Matterport3D? Looking forward to your response.
I'm confused about the coordination computation. Could you explain it more clearly? Thank you.
We have not been able to reproduce the results given the code in this repository. Here is what we have tried.
We loaded the provided pre-trained weights for the models “CelebA_128x128_N2M2S64” and “CelebA_64x64_N2M2S32” and ran inference. However, the generated images do not look as good as the ones in the paper, and the calculated FID is several orders of magnitude higher than expected (around 300). The only modification we made to the code was to replace scipy.misc.imread with imageio.imread.
In addition, we re-trained the 64x64 model with the configuration you provided (“configs/CelebA_64x64_N2M2S32.yaml”). We had to change the variable basic_layers on line 51 in model/discriminator.py from [2, 4, 8, 8] to [1, 4, 8, 8] to match the pre-trained weight dimensions.
This experiment also yielded high FID and non-realistic images.
How did you train the provided weights? Did you use the private codebase? What might be a reason why we cannot reproduce the results?
Thank you.
Hi, thanks for your novel work. I have some questions.
In Table 1, are the CelebA 128 results based on aligned images or cropped images? Would you mind providing the fid_stat file for the CelebA 128 aligned images? That would be very helpful to me.
Thanks so much!
Hi, I'm studying the code for your COCO-GAN paper and have a small problem that I hope you can help me with. The appendix of the paper says that, in order to give c' and c'' a scale similar to the latent code, their values are normalized to [-1,1]. Based on my understanding of the _euclidean_sample_coord function in coord_handler.py, the returned d_macro_coord values are within [-1,1], but the g_micro_coord values are original pixel positions: their computation on lines 131 and 132 multiplies by gpc_x and gpc_y, which scales them to the original image scale, and I didn't see any other normalization step before they are concatenated with the latent code in the generator. Could you help me point out where I got it wrong? Thanks
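To make the scale difference in this question concrete, here is a minimal sketch that contrasts a grid index normalized to [-1,1] with a pixel-scale coordinate. It is not the repository's code: the function names normalize_index and pixel_offset, the 4 positions per axis, and the 16-pixel patch size are all assumptions chosen for illustration.

```python
def normalize_index(index, num_positions):
    """Map a grid index in [0, num_positions - 1] linearly onto [-1, 1]."""
    if num_positions < 1 or not 0 <= index < num_positions:
        raise ValueError("index must lie in [0, num_positions)")
    if num_positions == 1:
        return 0.0  # a single position sits at the centre of the range
    return 2.0 * index / (num_positions - 1) - 1.0


def pixel_offset(index, patch_size):
    """Hypothetical pixel-scale coordinate: grid index times patch size."""
    return index * patch_size


# Assumed values for illustration only, not read from any config file.
num_positions = 4   # patch positions per axis
patch_size = 16     # patch side length in pixels

normalized = [normalize_index(i, num_positions) for i in range(num_positions)]
pixels = [pixel_offset(i, patch_size) for i in range(num_positions)]

print(normalized)  # values spread evenly between -1 and 1
print(pixels)      # values on the pixel scale, far outside [-1, 1]
```

If the generator really did receive pixel-scale values like the second list, they would be much larger in magnitude than a latent code drawn from a unit-scale distribution, which is the concern raised above.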
I'm just trying to run all the preprocessing/training on the CelebA dataset, and so far I'm missing several packages besides just tensorflow. Some of the scipy utilities have been deprecated at this point, so I've had to go through and find the correct versions of certain packages, including that one, to get things working. I would greatly appreciate a requirements.txt or conda environment.yaml file. Thanks!
How many epochs for LSUN bedroom dataset to get reported FID? Thanks!
Using python 3.6.9 and packages as advised in README, I tried to train from scratch with:
python ./scripts/compute_tfrecord.py --dataset celeba --resolution 64
python ./fid_utils/precalc_fid_stats.py --dataset celeba --data_path "./data/CelebA/*" --resolution=64
python main.py --config="./configs/CelebA_64x64_N2M2S32.yaml"
I didn't set up the number of epochs as advised in the yaml.
The training started and continued for quite some time, the tensorboard looks fine I think?
However, after exactly 130000 steps there was an error, the traceback of which I post below. Any idea why this happened?
[CelebA_64x64_N2M2S32] [Epoch: 54; 1261/2384; global_step:129997] elapsed: 64138.1083, d: -1490.8943, g: -139648.8750, q: 0.0000
[CelebA_64x64_N2M2S32] [Epoch: 54; 1262/2384; global_step:129998] elapsed: 64138.4170, d: 151.7455, g: -136700.0781, q: 0.0000
[CelebA_64x64_N2M2S32] [Epoch: 54; 1263/2384; global_step:129999] elapsed: 64138.7237, d: 741.5792, g: -140252.2188, q: 0.0000
[CelebA_64x64_N2M2S32] [Epoch: 54; 1264/2384; global_step:130000] elapsed: 64139.0329, d: -1249.9255, g: -138572.9062, q: 0.0000
62%|█████████████████████████████████████████████████████████████████▉ | 482/782 [09:07<05:41, 1.14s/it]2020-07-27 08:41:02.094112: E tensorflow/core/kernels/check_numerics_op.cc:185] abnormal_detected_host @0x7f19ed960100 = {1, 0} activation input is not finite.
62%|█████████████████████████████████████████████████████████████████▉ | 482/782 [09:08<05:41, 1.14s/it]
2020-07-27 08:41:02.107534: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.107596: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.107654: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.107891: W tensorflow/core/kernels/queue_base.cc:277] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108087: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108192: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108304: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108340: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108361: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108384: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108420: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108443: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108464: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108487: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108511: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108532: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
2020-07-27 08:41:02.108555: W tensorflow/core/kernels/queue_base.cc:277] _1_shuffle_batch/random_shuffle_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: activation input is not finite. : Tensor had NaN values
[[{{node FID_Inception_Net/mixed_4/tower/conv_2/CheckNumerics}}]]
[[{{node FID_Inception_Net/pool_3}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 131, in <module>
trainer.train(logger, evaluator, global_step)
File "/home/janek/Documents/COCO-GAN/trainer.py", line 419, in train
z_iter, z_fixed, feed_dict_iter, feed_dict_fixed)
File "/home/janek/Documents/COCO-GAN/logger.py", line 207, in log_iter
cur_fid = evaluator.evaluate(trainer)
File "/home/janek/Documents/COCO-GAN/evaluator.py", line 71, in evaluate
batch_features = fid.get_activations(gen_full_images, self.sess, self.batch_size)
File "/home/janek/Documents/COCO-GAN/fid_utils/fid.py", line 125, in get_activations
pred = sess.run(inception_layer, {'FID_Inception_Net/ExpandDims:0': batch})
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: activation input is not finite. : Tensor had NaN values
[[node FID_Inception_Net/mixed_4/tower/conv_2/CheckNumerics (defined at /home/janek/Documents/COCO-GAN/fid_utils/fid.py:45) ]]
[[node FID_Inception_Net/pool_3 (defined at /home/janek/Documents/COCO-GAN/fid_utils/fid.py:45) ]]
Caused by op 'FID_Inception_Net/mixed_4/tower/conv_2/CheckNumerics', defined at:
File "main.py", line 115, in <module>
evaluator.build_graph()
File "/home/janek/Documents/COCO-GAN/evaluator.py", line 45, in build_graph
fid.create_inception_graph(inception_path)
File "/home/janek/Documents/COCO-GAN/fid_utils/fid.py", line 45, in create_inception_graph
_ = tf.import_graph_def( graph_def, name='FID_Inception_Net')
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/importer.py", line 235, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3433, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3433, in <listcomp>
for c_op in c_api_util.new_tf_operations(self)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3325, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/janek/Documents/COCO-GAN/coco_venv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): activation input is not finite. : Tensor had NaN values
[[node FID_Inception_Net/mixed_4/tower/conv_2/CheckNumerics (defined at /home/janek/Documents/COCO-GAN/fid_utils/fid.py:45) ]]
[[node FID_Inception_Net/pool_3 (defined at /home/janek/Documents/COCO-GAN/fid_utils/fid.py:45) ]]
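The training lines just before the failure show a generator loss around -1.4e5. One possible reading, not confirmed by the log itself, is that training had already diverged by the time the FID evaluation hit NaN values. The sketch below parses log lines in the format shown above and flags steps whose losses are non-finite or very large. The regular expression, the function names, and the 1e4 threshold are assumptions made for illustration; none of them come from the repository.

```python
import math
import re

# Matches lines like:
# [name] [Epoch: 54; 1261/2384; global_step:129997] elapsed: ..., d: ..., g: ..., q: ...
LOG_PATTERN = re.compile(
    r"\[Epoch:\s*(\d+);\s*(\d+)/(\d+);\s*global_step:(\d+)\]\s*"
    r"elapsed:\s*([-+.\w]+),\s*d:\s*([-+.\w]+),\s*g:\s*([-+.\w]+),\s*q:\s*([-+.\w]+)"
)


def parse_log_line(line):
    """Return a dict of fields for a training log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    epoch, step, steps_per_epoch, global_step, elapsed, d, g, q = match.groups()
    return {
        "epoch": int(epoch),
        "step": int(step),
        "steps_per_epoch": int(steps_per_epoch),
        "global_step": int(global_step),
        "elapsed": float(elapsed),
        "d": float(d),
        "g": float(g),
        "q": float(q),
    }


def is_suspicious(record, threshold=1e4):
    """Flag a record whose d or g loss is non-finite or exceeds the assumed threshold."""
    return any(
        not math.isfinite(record[key]) or abs(record[key]) > threshold
        for key in ("d", "g")
    )


# Two lines copied from the log above.
sample_lines = [
    "[CelebA_64x64_N2M2S32] [Epoch: 54; 1261/2384; global_step:129997] "
    "elapsed: 64138.1083, d: -1490.8943, g: -139648.8750, q: 0.0000",
    "[CelebA_64x64_N2M2S32] [Epoch: 54; 1264/2384; global_step:130000] "
    "elapsed: 64139.0329, d: -1249.9255, g: -138572.9062, q: 0.0000",
]

records = [r for r in map(parse_log_line, sample_lines) if r is not None]
flagged = [r["global_step"] for r in records if is_suspicious(r)]

# Average wall-clock seconds per step between the first and last parsed record.
steps = records[-1]["global_step"] - records[0]["global_step"]
seconds_per_step = (records[-1]["elapsed"] - records[0]["elapsed"]) / steps

print(flagged)
print(seconds_per_step)
```

Running a check like this over the whole log would show at which step the losses first left a reasonable range, which could help narrow down when the instability began.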