minfengzhu / dm-gan

License: MIT License

Python 99.39% Starlark 0.61%

dm-gan's Issues

Where to find the output images of !python2 main.py --cfg cfg/eval_bird.yml --gpu 0

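Error when computing IS for COCO?

When I evaluate the generated results with IS for COCO, running the _init_inception() function raises an error at line 91, where the shape is read, like this:

While debugging, I found that some of the shapes in this part are () or (None, None, 3), even though the model being loaded is 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'. How should I handle this problem?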

About the metric IS

I have loaded your code and did not revise any of it. However, I cannot get the right IS score, although I can get the right FID score.
So where did I go wrong? (I got an IS of only 1.19.) I need your help!

error when using captions from text file for validation

Hi,

I'm trying to load captions from txt files during validation (i.e. the if block on line 223 in datasets.py).
However it gives me the following error:
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([26835, 300]) from checkpoint, where the shape is torch.Size([27297, 300]) in current model.

This has also been reported in the AttnGAN issues: taoxugit/AttnGAN#45
Would be great if you could tell me why this error pops up using txt files.

The validation works if I use the default method.
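
A possible reading of the size mismatch above: the embedding table has one row per vocabulary word, so any change to the vocabulary at load time changes the shape the model expects. The sketch below, which is an assumption and not code from datasets.py, encodes new captions against a fixed word-to-index dictionary instead of rebuilding the vocabulary. The names `encode_caption` and `wordtoix`, and the tiny vocabulary, are illustrative only.

```python
import re

def encode_caption(caption, wordtoix, max_words=25):
    """Map a caption to word indices using an existing vocabulary.

    Words missing from `wordtoix` are skipped rather than added, so the
    vocabulary size (and the embedding shape it implies) never changes.
    """
    tokens = re.findall(r"\w+", caption.lower())
    indices = [wordtoix[t] for t in tokens if t in wordtoix]
    return indices[:max_words]

# Hypothetical tiny vocabulary; index 0 is reserved for an end token.
wordtoix = {"<end>": 0, "a": 1, "small": 2, "bird": 3, "with": 4, "red": 5, "wings": 6}

# "crimson" is not in the vocabulary, so it is dropped.
encoded = encode_caption("A small bird with crimson wings", wordtoix)
vocab_size_after = len(wordtoix)
```

With this approach the number of embedding rows stays equal to the size of the vocabulary the checkpoint was trained with, at the cost of silently ignoring unseen words.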

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 4 and 1 in dimension 2 at /tmp/pip-req-build-0vti0ns4/aten/src/THC/generic/THCTensorMath.cu:71

Traceback (most recent call last):
  File "main.py", line 148, in <module>
    algo.train()
  File "/home/tjl/recurrence/DM-GAN-master/code/trainer.py", line 291, in train
    sent_emb, real_labels, fake_labels)
  File "/home/tjl/recurrence/DM-GAN-master/code/miscc/losses.py", line 143, in discriminator_loss
    cond_real_logits = netD.COND_DNET(real_features, conditions)
  File "/home/tjl/anaconda3/envs/t1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tjl/recurrence/DM-GAN-master/code/model.py", line 627, in forward
    h_c_code = torch.cat((h_code, c_code), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 4 and 1 in dimension 2 at /tmp/pip-req-build-0vti0ns4/aten/src/THC/generic/THCTensorMath.cu:71
My machine has a 2080 GPU, so I cannot use PyTorch 0.4.0. The error above appears after switching to PyTorch 1.2.0. I hope the author can help with this.
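
The error message states the concatenation rule directly: every dimension other than the one being concatenated must have the same size in both tensors. The following shape-only sketch (plain Python, no PyTorch) illustrates that rule. The shapes assigned to `h_code` and `c_code` are assumptions chosen to reproduce "Got 4 and 1 in dimension 2", and the helpers `cat_shape` and `repeat_spatial` are hypothetical. Whether the conditioning code really goes untiled under PyTorch 1.2.0 is not confirmed by the report.

```python
def cat_shape(shape_a, shape_b, dim=1):
    """Return the shape of concatenating two tensors along `dim`.

    Mirrors the rule in the error message: every dimension except `dim`
    must have the same size in both inputs.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("tensors must have the same number of dimensions")
    for i, (a, b) in enumerate(zip(shape_a, shape_b)):
        if i != dim and a != b:
            raise ValueError(
                "sizes must match except in dimension %d: got %d and %d in dimension %d"
                % (dim, a, b, i))
    out = list(shape_a)
    out[dim] = shape_a[dim] + shape_b[dim]
    return tuple(out)

def repeat_spatial(shape, height, width):
    """Shape after tiling a (N, C, 1, 1) tensor to (N, C, height, width)."""
    n, c, h, w = shape
    return (n, c, h * height, w * width)

h_code = (10, 256, 4, 4)   # assumed image feature map shape
c_code = (10, 256, 1, 1)   # assumed sentence code shape before tiling

try:
    cat_shape(h_code, c_code)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# Tiling the sentence code over the 4x4 grid makes the shapes compatible.
fixed = cat_shape(h_code, repeat_spatial(c_code, 4, 4))
```

If the real tensors look like this, printing `h_code.shape` and `c_code.shape` just before the `torch.cat` call in model.py would show which of the two has the unexpected spatial size.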

DM-GAN for coco or bird

Why is my FID so high using the FID code provided there?

I used 'DM-GAN/eval/FID/find_score.py' to calculate the FID of AttnGAN on the COCO dataset. Before that, I set B_VALIDATION = TRUE, changed BATCH_SIZE to 50, and trained my model for 120 epochs to produce the evaluation images. I didn't change anything in find_score.py. However, I got FID = 62.00581731786599! Is it because the coco_val.npz statistics I used cannot be used with AttnGAN? Has anyone been in this situation and solved it? 😔
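
For reference when comparing numbers, FID is the Fréchet distance between two Gaussians fitted to Inception features. In the formula below, $(\mu_r, \Sigma_r)$ are the mean and covariance of the reference features (assumed here to be what a file such as coco_val.npz stores) and $(\mu_g, \Sigma_g)$ are those of the generated images:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Because both terms depend on the feature extractor, the image preprocessing, and the number of images used to estimate the statistics, scores computed with different reference statistics or a different number of generated images are not directly comparable.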

model performance on coco dataset

great job

How is R-precision implemented?

Unable to extract CUB_200_2011.tgz to ./data/birds

While trying to extract the archive with tar, I get the following error:

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
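
"gzip: stdin: not in gzip format" means the file does not begin with the gzip magic bytes. Common causes include a download that saved an HTML page instead of the archive, or a plain tar file with a .tgz name. Which applies here is not known. A minimal sketch for inspecting a downloaded file follows. The file names are stand-ins created in a temporary directory, and `looks_like_gzip` and `first_bytes` are hypothetical helpers.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"

def looks_like_gzip(path):
    """Return True if the file starts with the two-byte gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC

def first_bytes(path, n=200):
    """Return the first bytes of a file, e.g. to spot an HTML error page."""
    with open(path, "rb") as f:
        return f.read(n)

# Demonstration on two temporary files (stand-ins for a real download).
tmpdir = tempfile.mkdtemp()
good = os.path.join(tmpdir, "good.tgz")
bad = os.path.join(tmpdir, "bad.tgz")
with gzip.open(good, "wb") as f:
    f.write(b"payload")
with open(bad, "wb") as f:
    f.write(b"<!DOCTYPE html><html>Download quota exceeded</html>")

good_is_gzip = looks_like_gzip(good)
bad_is_gzip = looks_like_gzip(bad)
bad_head = first_bytes(bad)
```

If the check fails and the first bytes look like HTML, the file needs to be downloaded again. If they look like a tar header instead, extracting without the gzip flag may work.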

Model Download Links Do Not Exist

The download links for the DM-GAN models for COCO and bird do not work. Furthermore, the FID statistics for bird and COCO (bird_val.npz and coco_val.npz) cannot be found either. Are these download links wrong? I would appreciate it if you could verify the relevant links.

Unable to train, the process is getting killed !!!

I installed all the dependencies and was ready to train the model on the birds dataset, but a few minutes after training started, the process got killed. What should I do? I need help.

Problem about training time.

Thanks a lot for your work. Was this model trained on a single GPU? And how long does it take to train for 600 epochs?
I trained for about 2 days on a 12 GB GPU. Is that normal? Thanks!

gen_example function

Hi, thanks for your great work. I want to run the gen_example function. I made a configuration file for it, but when I start it I get an error.
Is there a way you would suggest to run it?

Thanks in advance.

When I try to train the birds model, I encounter this problem. Could you help me solve it? RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

When I try to train the birds model, I encounter this problem.

Could you help me solve it?

RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.
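
A possible explanation, not confirmed by the authors: a vocabulary size of 1 is what one would see if the data loader built its vocabulary from the caption files, found no captions at all (for example because DATA_DIR points at the wrong folder), and was left with only a reserved end token. The sketch below illustrates that behaviour under the assumption of an AttnGAN-style vocabulary builder. `build_vocab` is a simplified, hypothetical stand-in, not the repository's code.

```python
from collections import Counter

def build_vocab(captions):
    """Build index<->word maps from tokenized captions.

    Index 0 is reserved for an end token, so an empty caption list still
    yields a vocabulary of size 1.
    """
    counts = Counter(word for caption in captions for word in caption)
    ixtoword = {0: "<end>"}
    wordtoix = {"<end>": 0}
    for word in sorted(counts):
        index = len(ixtoword)
        ixtoword[index] = word
        wordtoix[word] = index
    return ixtoword, wordtoix

# No captions found (for example, DATA_DIR points at the wrong folder).
empty_ixtoword, _ = build_vocab([])
n_words_empty = len(empty_ixtoword)

# Captions found: one entry per distinct word, plus the end token.
ixtoword, _ = build_vocab([["a", "red", "bird"], ["a", "blue", "bird"]])
n_words = len(ixtoword)
```

If this is the cause, checking that the caption files exist under the configured data directory, and deleting any cached vocabulary file created during the failed run, would be the first things to try.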

How to train and create my own model with my own custom dataset of images and their descriptions

I want to use another dataset instead of the given cub and coco dataset, so how do I do that?

Evaluation Metrics

Hi there. I'm replicating DM-GAN on CUB but got bad results (R-Precision: 57.07 (±0.71); FID: 25.68) after training for 700 epochs. Besides, I got even worse results (R-Precision: 35.45 (±0.64); FID: 42.57) using the checkpoint (bird_DMGAN.pth) you provided. For generation, I produced 30,000 images using the scripts provided. Can you please help me with this issue?
btw, I have no access to the inception model for IS evaluation on google drive, can you please provide another download link?
Thanks!
Here is my config:

CONFIG_NAME: 'DMGAN'

DATASET_NAME: 'birds'
DATA_DIR: '../data/birds'
GPU_ID: 7
WORKERS: 1

B_VALIDATION: True  # True  # False
TREE:
    BRANCH_NUM: 3


TRAIN:
    FLAG: False
    NET_G: '../models/bird_DMGAN.pth'
    B_NET_D: False
    BATCH_SIZE: 10
    NET_E: '../DAMSMencoders/bird/text_encoder200.pth'


GAN:
    DF_DIM: 32
    GF_DIM: 64
    Z_DIM: 100
    R_NUM: 2

TEXT:
    EMBEDDING_DIM: 256
    CAPTIONS_PER_IMAGE: 10
    WORDS_NUM: 25

datasets assert

I tried to run the code, but the dataset path is wrong. Can you show me the correct path?

Evaluation on CUB and COCO dataset ?

Hi author, I have a question: how many epochs did you train to obtain the IS and FID values of the released pretrained models? Was it 800 epochs for CUB and 200 epochs for COCO (as in the config files), or 600 epochs for CUB and 120 epochs for COCO (as in the paper)? Many thanks.

CUDA Driver version problem.

When I try to train the birds model, I encounter this problem.

CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:32

How do you implement the R precision? Could you share the code?

I tried implementing R precision myself, but the result I got with the AttnGAN model is very low, near 3%.
So how do you implement the R precision? Could you share the code?

Thanks for your help!!
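
For orientation, here is a minimal sketch of a single R-precision trial, assuming the protocol described in the AttnGAN paper: each generated image is scored by cosine similarity against 100 candidate captions (1 ground truth plus 99 randomly chosen mismatched ones), and it counts as a hit when the ground truth ranks first (R = 1). The embeddings below are random toy vectors, and all function names are illustrative. This is not the authors' implementation.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def r_precision_hit(image_emb, true_text_emb, distractor_text_embs):
    """Return 1 if the true caption is the top-1 match for the image, else 0."""
    candidates = [true_text_emb] + list(distractor_text_embs)
    scores = [cosine(image_emb, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return 1 if best == 0 else 0

rng = random.Random(0)

def rand_vec(dim=16):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Unrelated random embeddings: the hit rate should sit near chance (about 1%).
trials = 500
hits = sum(
    r_precision_hit(rand_vec(), rand_vec(), [rand_vec() for _ in range(99)])
    for _ in range(trials)
)
chance_rate = hits / trials

# An image embedding identical to its caption embedding is always retrieved.
caption = rand_vec()
aligned_hit = r_precision_hit(caption, caption, [rand_vec() for _ in range(99)])
```

Under this protocol, chance level is 1 in 100, so a result of a few percent suggests the image and text embeddings are close to unrelated. One thing worth checking in that case is whether the image encoder and text encoder come from the same pretrained pair.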

The sequence of resblock and upblock in NextstageG

Dear researcher, I have a question about the order of the resblock and upblock. In your paper, the output of the Response Gate goes first to an upblock and then to two resblocks, but in the code it seems to go through the resblocks first. Which is correct? Does the order matter?

Some Q with Real_Acc & Fake_Acc

Hello sir
I don't understand how Real_Acc and Fake_Acc relate to image quality.

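assert datasets error and dimension mismatch problem

While reproducing the results, I ran into two problems:
1. With the model files and data files placed strictly according to the instructions, an assert datasets error occurred.
2. A dimension mismatch occurred. I don't know whether the model is in the wrong location or the provided model is wrong.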
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.“data”

IS value is Nan

I tried to run the IS code in tf19.12 and ran into a problem.
Here are my settings in inception_score_birds:
tf.flags.DEFINE_string('checkpoint_dir', './model.ckpt', """Path where to read model checkpoints.""")
tf.flags.DEFINE_string('image_folder', './bird_DMGAN/valid/single', """Path where to load the images """)
tf.flags.DEFINE_integer('num_classes', 50, """Number of classes """)  # 20 for flowers
tf.flags.DEFINE_integer('splits', 10, """Number of splits """)
tf.flags.DEFINE_integer('batch_size', 3, "batch size")
I also changed line 79 from
img = scipy.misc.imresize(img, (299, 299, 3), interp='bilinear')
to
img = np.array(Image.fromarray(img).resize((299,299)))

because misc.imresize does not work.

The images are definitely being loaded from image_folder, but when I print the KL values, some entries of the matrix are NaN, and the final mean and std end up NaN as well. Has anyone met this problem and solved it?
By the way, running the IS code always takes half an hour or more. Does anyone know how to speed it up?
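
NaN values in a KL computation typically come from evaluating log(0), or 0 * log(0), when a softmax output underflows to exactly zero. The sketch below computes the Inception Score, exp(E_x KL(p(y|x) || p(y))), from per-image class probabilities in plain Python and skips zero-probability terms. It is an illustration under those assumptions, not the repository's TensorFlow script, and `inception_score` is a hypothetical helper.

```python
import math

def inception_score(probs):
    """Inception Score from per-image class probabilities.

    probs: list of rows, each a probability distribution p(y|x).
    Computes exp(mean_x KL(p(y|x) || p(y))). Terms with p = 0 are skipped,
    since p * log(p) tends to 0 as p -> 0; this avoids log(0).
    """
    n = len(probs)
    k = len(probs[0])
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    total_kl = 0.0
    for row in probs:
        for p, m in zip(row, marginal):
            if p > 0.0:  # m >= p / n > 0 here, so log(m) is defined
                total_kl += p * (math.log(p) - math.log(m))
    return math.exp(total_kl / n)

# Identical predictions for every image give the minimum score of 1.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(8)]
score_uniform = inception_score(uniform)

# Confident, evenly spread predictions (one-hot rows with exact zeros)
# give the maximum score, equal to the number of classes.
one_hot = [[1.0 if j == i % 4 else 0.0 for j in range(4)] for i in range(8)]
score_one_hot = inception_score(one_hot)
```

A score close to 1 means the classifier assigns nearly the same distribution to every image, which can happen when the checkpoint is not actually loaded or the image preprocessing differs from what the classifier expects.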

Request for release of pretrained discriminator checkpoints

Hi,
Would it be possible for you to release the pretrained discriminator checkpoints that correspond to the pretrained DM-GAN generator checkpoints that have been released? I wanted to attempt to fine tune the DM-GAN model (training both discriminators and generators), and it would be much easier to start from the provided pretrained checkpoints than having to re-train from scratch.

How do you implement the R precision for CUB dataset?How many images have you generated?

Could someone please help me?
The validation set for CUB has 2933 images. How many images did you generate to calculate the R precision? 30000?
In the code, what size did you use for R? Is the 30,000 condition taken into account?
R_count = 0
R = np.zeros(2928)
...

if R_count >= 30000:
    sum = np.zeros(8)
    np.random.shuffle(R)
    for i in range(8):
        sum[i] = np.average(R[i * 3000:(i + 1) * 3000 - 1])
    R_mean = np.average(sum) * 100
    R_std = np.std(sum) * 100
    print("R mean:{:.2f} std:{:.2f}".format(R_mean, R_std))
    cont = False

About R-precision

When I tried to reproduce the R-precision values from the experiments, I had some doubts. I used DM-GAN to test 30,000 CUB images with the R-precision code. With R=1, the R-precision is 4%. Is the R-precision in your paper computed for R=1? How many test images are there? If I want to reproduce your experimental results, what do I need to pay attention to when testing?

inception_finetuned_models.zip

I can't download the file "inception_finetuned_models.zip" for the calculation of IS. Can you give me this file ?

RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([5450, 300]).

For validation I followed your steps: go into the code/ folder and run python main.py --cfg cfg/eval_bird.yml --gpu 0. But it showed RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([5450, 300]).
When I ran python main.py --cfg cfg/eval_coco.yml --gpu 0, it showed RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([27297, 300]).
What is the reason, and how should I fix it?

How to generate images from custom captions, using the pre-trained model?

ACM module

Thanks for your great work on T2I. After reading your paper and reproducing your code, I found something strange. I plotted g_loss, d_loss, and the discriminator accuracy with visdom. The D loss does not decrease continuously. Must I stop the training after 800 epochs? Can I stop early by observing the loss and accuracy curves?

Problem about IS evaluation.

One more question: I see that the IS evaluation in the code uses TensorFlow. It should also be fine to use PyTorch, right?

Query regarding spectral.py

Respected @MinfengZhu and @Basasuya,

What is the actual need for forming the class SpectralNorm?
Why do we create the additional parameters u and v?
Why do we use L2 normalization and what is idea behind updating u,v & w?
What is indicated by power_iterations?
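
In general terms, and without reference to the repository's spectral.py: spectral normalization divides a weight matrix W by its largest singular value so that the layer's spectral norm is 1. That singular value is estimated by power iteration, where u and v are running estimates of the leading left and right singular vectors, L2 normalization keeps them at unit length so that u^T W v approximates the singular value, and the number of power iterations is how many update steps are run per call. A pure-Python sketch with hypothetical helper names:

```python
import math
import random

def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def matvec(w, v):
    """Compute W v for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]

def matvec_t(w, u):
    """Compute W^T u for a matrix stored as a list of rows."""
    cols = len(w[0])
    return [sum(w[i][j] * u[i] for i in range(len(w))) for j in range(cols)]

def spectral_norm_estimate(w, power_iterations=1, u=None, seed=0):
    """Estimate the largest singular value of `w` by power iteration.

    Expects power_iterations >= 1. Returns (sigma, u, v). Passing the
    returned `u` back in on the next call continues the iteration, which
    is why u is kept between steps.
    """
    rng = random.Random(seed)
    if u is None:
        u = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(len(w))])
    for _ in range(power_iterations):
        v = l2_normalize(matvec_t(w, u))   # v <- W^T u / ||W^T u||
        u = l2_normalize(matvec(w, v))     # u <- W v / ||W v||
    sigma = sum(ui * wi for ui, wi in zip(u, matvec(w, v)))  # u^T W v
    return sigma, u, v

# A diagonal matrix whose largest singular value is 3.
w = [[3.0, 0.0], [0.0, 1.0]]
sigma, u, v = spectral_norm_estimate(w, power_iterations=50)
w_normalized = [[x / sigma for x in row] for row in w]
```

In training code a single iteration per forward pass is usually enough, because the weights change slowly and u carries over from the previous step.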

R_precision: why 10 in sum = np.zeros(10)

Hi there! It would be awesome if you could explain why sum is initialized as an array of length 10 for bird captions. I guess in the case of COCO the length of sum will be different, right?

To be more precise, I am referring to code/trainer.py:
...
if R_count >= 30000:
    sum = np.zeros(10)
...
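
One plausible reading of the quoted snippet, not confirmed by the authors: the 30,000 per-image hits are shuffled and divided into 10 splits of 3,000, and the reported mean and standard deviation are taken over the 10 split averages. On that reading, 10 is simply 30000 / 3000 and does not depend on the dataset. A minimal sketch of that aggregation, with a hypothetical helper `split_mean_std` and synthetic data:

```python
import random
import statistics

def split_mean_std(hits, n_splits=10, split_size=3000, seed=0):
    """Shuffle per-image hits (0 or 1), average each split, and return
    the mean and standard deviation of the split averages, in percent."""
    needed = n_splits * split_size
    if len(hits) < needed:
        raise ValueError("need at least %d hits, got %d" % (needed, len(hits)))
    shuffled = list(hits)
    random.Random(seed).shuffle(shuffled)
    split_means = [
        statistics.mean(shuffled[i * split_size:(i + 1) * split_size])
        for i in range(n_splits)
    ]
    # pstdev is the population standard deviation, like np.std by default.
    return 100 * statistics.mean(split_means), 100 * statistics.pstdev(split_means)

# 30000 synthetic hits with an exact 70% hit rate.
hits = [1] * 21000 + [0] * 9000
r_mean, r_std = split_mean_std(hits)
```

Because the splits have equal size and together cover all 30,000 hits, the mean over splits equals the overall hit rate. Only the standard deviation depends on how the hits are shuffled.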

IS on the CUB dataset?

I have evaluated the pre-trained model provided by the author, but I only got an IS of 4.62 on CUB. Are there any tricks in the code?