dm-gan's People

Contributors

minfengzhu

dm-gan's Issues

About the metric IS

I downloaded your code and did not modify anything. I can reproduce the FID score, but I cannot reproduce the IS score: I only get an IS of 1.19.
What am I doing wrong? Any help would be appreciated.

error when using captions from text file for validation

Hi,

I'm trying to load captions from txt files during validation (i.e. the if block on line 223 in datasets.py).
However it gives me the following error:
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([26835, 300]) from checkpoint, where the shape is torch.Size([27297, 300]) in current model.

This has also been reported in the AttnGAN issues: taoxugit/AttnGAN#45
Would be great if you could tell me why this error pops up using txt files.

The validation works if I use the default method.

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 4 and 1 in dimension 2 at /tmp/pip-req-build-0vti0ns4/aten/src/THC/generic/THCTensorMath.cu:71

Traceback (most recent call last):
  File "main.py", line 148, in <module>
    algo.train()
  File "/home/tjl/recurrence/DM-GAN-master/code/trainer.py", line 291, in train
    sent_emb, real_labels, fake_labels)
  File "/home/tjl/recurrence/DM-GAN-master/code/miscc/losses.py", line 143, in discriminator_loss
    cond_real_logits = netD.COND_DNET(real_features, conditions)
  File "/home/tjl/anaconda3/envs/t1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/tjl/recurrence/DM-GAN-master/code/model.py", line 627, in forward
    h_c_code = torch.cat((h_code, c_code), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 4 and 1 in dimension 2 at /tmp/pip-req-build-0vti0ns4/aten/src/THC/generic/THCTensorMath.cu:71

My machine has a 2080 GPU, so I cannot use PyTorch 0.4.0. The error above appears after switching to PyTorch 1.2.0. I hope the author can help with this.
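
The error message says that concatenating along dimension 1 requires every other dimension to match, and that dimension 2 was 4 for one tensor and 1 for the other. The sketch below reproduces that rule with plain Python shape tuples, so it runs without PyTorch. `cat_shape` and `tile_spatial` are hypothetical helpers, and the batch and channel sizes are made-up values, not read from model.py.

```python
def cat_shape(shape_a, shape_b, dim):
    """Return the shape of concatenating two tensors along `dim`.

    Mirrors the rule quoted in the error: every dimension except
    `dim` must match.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("tensors must have the same number of dimensions")
    for d, (a, b) in enumerate(zip(shape_a, shape_b)):
        if d != dim and a != b:
            raise ValueError(
                f"Sizes must match except in dimension {dim}. "
                f"Got {a} and {b} in dimension {d}"
            )
    out = list(shape_a)
    out[dim] = shape_a[dim] + shape_b[dim]
    return tuple(out)


def tile_spatial(shape, height, width):
    """Shape of an (N, C, 1, 1) code after repeating it over a height x width grid."""
    n, c, h, w = shape
    if (h, w) != (1, 1):
        raise ValueError("expected a 1x1 spatial code")
    return (n, c, height, width)


# Hypothetical sizes: image features on a 4x4 grid, text condition as a 1x1 code.
h_code = (10, 256, 4, 4)
c_code = (10, 256, 1, 1)

try:
    cat_shape(h_code, c_code, dim=1)
except ValueError as err:
    mismatch_message = str(err)  # same kind of complaint as in the traceback

# Once the condition is tiled to 4x4, concatenation along channels is valid.
merged = cat_shape(h_code, tile_spatial(c_code, 4, 4), dim=1)
```

If the real tensors follow this pattern, printing `h_code.size()` and `c_code.size()` just before the `torch.cat` call in model.py would show which one kept a 1x1 grid.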

Why is my FID so high using the FID code provided there?

I use 'DM-GAN/eval/FID/find_score.py' to calculate the FID of AttnGAN on the COCO dataset. Before that, I set B_VALIDATION = True, changed BATCH_SIZE to 50, and trained my model for 120 epochs to produce the evaluation images. I did not change anything in find_score.py. However, I got FID = 62.00581731786599! Is it because the coco_val.npz file I used cannot be used with AttnGAN? Has anyone been in this situation and solved it? 😔

Model Download Links Do Not Exist

The DM-GAN models for COCO and birds do not exist at the download links. The FID files for birds and COCO (bird_val.npz and coco_val.npz) cannot be found either. Are the download links wrong? I would appreciate it if you could verify them.

Question about training time

Thanks a lot for your work. Is this model trained on a single GPU? And how long does it take to train for 600 epochs? I have been training for about 2 days on a 12 GB GPU; is that normal? Thanks!

gen_example function

Hi, thanks for your great work. I want to run the gen_example function. I made a configuration file for it, but I get an error when I start it.
Is there a way you would suggest to run it?

Thanks in advance.

When I try to train the birds model, I encounter this problem. Could you help me solve it? RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

When I try to train the birds model, I encounter this problem.

Could you help me solve it?

RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.

Evaluation Metrics

Hi there. I'm replicating DM-GAN on CUB but got bad results (R-Precision: 57.07 (±0.71); FID: 25.68) after training for 700 epochs. I got even worse results (R-Precision: 35.45 (±0.64); FID: 42.57) using the checkpoint you provided (bird_DMGAN.pth). For evaluation, I generated 30,000 images using the provided scripts. Can you please help me with this issue?
By the way, I cannot access the Inception model for IS evaluation on Google Drive. Could you please provide another download link?
Thanks!
Here is my config:

CONFIG_NAME: 'DMGAN'

DATASET_NAME: 'birds'
DATA_DIR: '../data/birds'
GPU_ID: 7
WORKERS: 1

B_VALIDATION: True  # True  # False
TREE:
    BRANCH_NUM: 3


TRAIN:
    FLAG: False
    NET_G: '../models/bird_DMGAN.pth'
    B_NET_D: False
    BATCH_SIZE: 10
    NET_E: '../DAMSMencoders/bird/text_encoder200.pth'


GAN:
    DF_DIM: 32
    GF_DIM: 64
    Z_DIM: 100
    R_NUM: 2

TEXT:
    EMBEDDING_DIM: 256
    CAPTIONS_PER_IMAGE: 10
    WORDS_NUM: 25

datasets assert

I tried to run the code, but the dataset path is wrong. Can you show me the correct path?

Evaluation on CUB and COCO dataset ?

Hi author, how many epochs did you train to obtain the IS and FID values of the released pretrained models: 800 epochs for CUB and 200 epochs for COCO (as in the config files), or 600 epochs for CUB and 120 epochs for COCO (as in the paper)? Many thanks.

CUDA Driver version problem.

When I try to train the birds model, I encounter this problem:

CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:32

The sequence of resblock and upblock in NextstageG

Dear researcher, I have a question about the order of the resblocks and the upblock. In your paper, the output of the Response Gate goes first to an upblock and then to two resblocks, but in the code it seems to go through the resblocks first. Which order is intended? Does the order matter?

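assert datasets error and dimension mismatch

I ran into two problems while reproducing the results:
1. I placed the model files and data files exactly as described, but an assert datasets error occurred.
2. A dimension mismatch occurred. I am not sure whether the model is in the wrong location or the provided model is wrong.
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: size mismatch for encoder.weight: copying a param of torch.Size([1, 300]) from checkpoint, where the shape is torch.Size([5450, 300]) in current model.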

IS value is Nan

I tried to run the IS code in tf19.12 and ran into a problem.
Here are my settings in inception_score_birds:
tf.flags.DEFINE_string('checkpoint_dir', './model.ckpt',
                       """Path where to read model checkpoints.""")
tf.flags.DEFINE_string('image_folder', './bird_DMGAN/valid/single',
                       """Path where to load the images """)
tf.flags.DEFINE_integer('num_classes', 50,  # 20 for flowers
                        """Number of classes """)
tf.flags.DEFINE_integer('splits', 10,
                        """Number of splits """)
tf.flags.DEFINE_integer('batch_size', 3, "batch size")
I also changed line 79 from
img = scipy.misc.imresize(img, (299, 299, 3), interp='bilinear')
to
img = np.array(Image.fromarray(img).resize((299,299)))

because misc.imresize does not work in my environment.

I can definitely read the images from image_folder, but when I print the KL values, some entries of the matrix are NaN, so the final mean and std are NaN as well. Has anyone encountered this problem and solved it?
By the way, running the IS code always takes half an hour or more. Does anyone know how to speed it up?
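
One possible source of the NaN values is the term p * log(p / q) when a predicted probability p is exactly zero, since 0 * log(0) is undefined in floating point. The sketch below computes an Inception Score from a list of softmax rows in plain Python and treats 0 * log(0) as 0. It illustrates that assumption only: `inception_score` is a hypothetical single-split simplification, not the repository's TensorFlow script.

```python
import math


def inception_score(preds):
    """Inception Score exp(mean_x KL(p(y|x) || p(y))) for one split.

    `preds` is a list of softmax rows, one row per image.
    Terms with p == 0 are skipped, i.e. 0 * log(0) is treated as 0.
    """
    n = len(preds)
    num_classes = len(preds[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[j] for row in preds) / n for j in range(num_classes)]
    kl_sum = 0.0
    for row in preds:
        for p, q in zip(row, marginal):
            if p > 0.0:  # q > 0 whenever any p in that column is > 0
                kl_sum += p * math.log(p / q)
    return math.exp(kl_sum / n)


# Four confident, evenly spread predictions over four classes.
one_hot = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
# Four completely uninformative predictions.
uniform = [[0.25] * 4 for _ in range(4)]

score_one_hot = inception_score(one_hot)
score_uniform = inception_score(uniform)
```

In this toy setup the confident, evenly spread predictions reach the number of classes, while identical predictions for every image give a score of 1.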

Request for release of pretrained discriminator checkpoints

Hi,
Would it be possible for you to release the pretrained discriminator checkpoints that correspond to the pretrained DM-GAN generator checkpoints that have been released? I wanted to attempt to fine tune the DM-GAN model (training both discriminators and generators), and it would be much easier to start from the provided pretrained checkpoints than having to re-train from scratch.

How do you implement the R-precision for the CUB dataset? How many images have you generated?

Could someone please help me?
The CUB validation set has 2,933 images. How many images did you generate to calculate the R-precision? 30,000?
In the code, what size did you use for R? Is the 30,000 condition taken into account?
`R_count = 0
R = np.zeros(2928)
.
.
.

    if R_count >= 30000:
        sum = np.zeros(8)
        np.random.shuffle(R)
        for i in range(8):
            sum[i] = np.average(R[i * 3000:(i + 1) * 3000 - 1])
        R_mean = np.average(sum) * 100
        R_std = np.std(sum) * 100
        print("R mean:{:.2f} std:{:.2f}".format(R_mean, R_std))
        cont = False

`
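
The snippet quoted above shuffles the per-image hit flags, averages them in fixed chunks of 3000, and reports the mean and standard deviation of the chunk averages. Two details are worth checking against it: the slice `R[i * 3000:(i + 1) * 3000 - 1]` leaves out the last element of every chunk, and a chunk that starts beyond the end of `R` is empty. The sketch below is a plain-Python version that derives the chunk size from the number of flags instead. `r_precision_stats`, `num_splits` and `seed` are hypothetical names, not taken from trainer.py.

```python
import random
import statistics


def r_precision_stats(hits, num_splits=10, seed=0):
    """Mean and std (in percent) of R-precision over equal-sized splits.

    `hits` holds one 0/1 flag per generated image: 1 if retrieval
    succeeded for that image, else 0.
    """
    if len(hits) < num_splits:
        raise ValueError("need at least one flag per split")
    shuffled = list(hits)
    random.Random(seed).shuffle(shuffled)
    chunk = len(shuffled) // num_splits  # leftover flags at the end are ignored
    split_means = [
        statistics.fmean(shuffled[i * chunk:(i + 1) * chunk])
        for i in range(num_splits)
    ]
    mean = statistics.fmean(split_means) * 100
    std = statistics.pstdev(split_means) * 100  # population std, like np.std
    return mean, std


# Toy data: 30000 flags, half of them hits, split into 10 chunks of 3000.
flags = [1] * 15000 + [0] * 15000
r_mean, r_std = r_precision_stats(flags, num_splits=10)
print("R mean:{:.2f} std:{:.2f}".format(r_mean, r_std))
```

With 30,000 flags and chunks of 3000 there are exactly 10 splits, so the number of splits and the length of `R` have to be changed together.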

About R-precision

I have some doubts after trying to reproduce the R-precision values from the experiments. I used DM-GAN to generate 30,000 CUB test images and evaluated them with the R-precision code. With R = 1, the R-precision is 4%. Is the R-precision reported in your paper for R = 1? How many test images are there? If I want to reproduce your experimental results, what do I need to pay attention to when testing?

RuntimeError: Error(s) in loading state_dict for RNN_ENCODER: While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([5450, 300]).

When I run validation following your steps (go into the code/ folder and run python main.py --cfg cfg/eval_bird.yml --gpu 0), I get:
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([5450, 300]).
When I run python main.py --cfg cfg/eval_coco.yml --gpu 0, I get:
RuntimeError: Error(s) in loading state_dict for RNN_ENCODER:
While copying the parameter named "encoder.weight", whose dimensions in the model are torch.Size([1, 300]) and whose dimensions in the checkpoint are torch.Size([27297, 300]).
What causes this, and how can I fix it?
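
The message above states that the model expects an embedding with 1 row while the checkpoint holds 5450 (birds) or 27297 (COCO) rows. One reading consistent with this, assuming the embedding size equals the size of the vocabulary built from the loaded captions plus one end token, is that no captions were found under the data directory. The sketch below illustrates that assumption only. `build_vocab` is a hypothetical helper, not the function in datasets.py.

```python
def build_vocab(captions):
    """Map each distinct word to an index, reserving index 0 for '<end>'.

    `captions` is a list of token lists. The vocabulary size returned
    here plays the role of the first dimension of encoder.weight.
    """
    word_to_ix = {"<end>": 0}
    for tokens in captions:
        for word in tokens:
            if word not in word_to_ix:
                word_to_ix[word] = len(word_to_ix)
    return word_to_ix


# Captions were found: the vocabulary grows with the distinct words.
loaded = [["a", "small", "red", "bird"],
          ["a", "bird", "with", "a", "long", "beak"]]
n_words_loaded = len(build_vocab(loaded))

# No captions were found (for example, a wrong data path): only '<end>' remains,
# so an embedding built from this vocabulary would have shape [1, 300].
n_words_empty = len(build_vocab([]))
```

Under this assumption, a first step would be to confirm that the caption and filename files exist under DATA_DIR before the checkpoint is loaded.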

ACM module

Thanks for your great work in T2I. After reading your paper and reproducing your code, I found something strange. I plotted g_loss, d_loss, and the discriminator accuracy with visdom. The discriminator loss does not decrease continuously. Must I wait until 800 epochs to stop training? Can I stop early by observing the loss and accuracy curves?

Query regarding spectral.py

Respected @MinfengZhu and @Basasuya,

  1. What is the actual need for forming the class SpectralNorm?
  2. Why do we create the additional parameters u and v?
  3. Why do we use L2 normalization, and what is the idea behind updating u, v and w?
  4. What is indicated by power_iterations?
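
As general background for these questions: spectral normalization divides a weight matrix by its largest singular value, and power iteration is a standard way to estimate that value by repeatedly multiplying two vectors u and v by the matrix and L2-normalizing them. The sketch below shows the idea in plain Python on nested lists. It is not the code in spectral.py, and the function names are hypothetical.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (almost exactly) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def matvec(mat, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, vec)) for row in mat]


def spectral_norm_estimate(weight, power_iterations=1, u=None):
    """Estimate the largest singular value of `weight` by power iteration.

    Returns (sigma, u, v), where u and v approximate the leading left
    and right singular vectors.
    """
    if power_iterations < 1:
        raise ValueError("power_iterations must be at least 1")
    if u is None:
        u = [1.0] * len(weight)
    u = l2_normalize(u)
    weight_t = [list(col) for col in zip(*weight)]
    for _ in range(power_iterations):
        v = l2_normalize(matvec(weight_t, u))  # v <- normalize(W^T u)
        u = l2_normalize(matvec(weight, v))    # u <- normalize(W v)
    sigma = sum(ui * wi for ui, wi in zip(u, matvec(weight, v)))  # u^T W v
    return sigma, u, v


# A diagonal matrix whose singular values are 3 and 1.
weight = [[3.0, 0.0], [0.0, 1.0]]
sigma_est, u, v = spectral_norm_estimate(weight, power_iterations=50)

# Dividing by the estimate gives a matrix with largest singular value near 1.
normalized = [[w / sigma_est for w in row] for row in weight]
```

More iterations give a closer estimate. Implementations often keep u between training steps, so a single iteration per step can be enough.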

R_precision: why 10 in sum = np.zeros(10)

Hi there! It would be great if you could explain why sum is initialized as an array of length 10 for bird captions. I guess the length will be different for COCO, right?

To be more precise, I am referring to this part of code/trainer.py:
if R_count >= 30000:
    sum = np.zeros(10)

IS on the CUB dataset?

I evaluated the pretrained model provided by the author, but I only got an IS of 4.62 on CUB. Are there any tricks in the code?
