ramakanth-pasunuru / video_captioning_rl


Code and Models for paper "Reinforced Video Captioning with Entailment Rewards (EMNLP 2017)"

License: MIT License

Python 100.00%


video_captioning_rl's Issues

The results presented here do not match those in your paper

Can you check your results? Thank you.

How to extract ResNet features for a new dataset?

Excellent work. However, if I want to use it on a new dataset, how can I extract the ResNet features? Can you give me some hints? Thank you very much.

Code Question

There is an error in your code.

video_captioning_rl/models/seq2seq_atten.py

frame_packed = nn.utils.rnn.pack_padded_sequence(frames, flengths, batch_first=True)

ValueError: some of the strides of a given numpy array are negative. This is currently not supported, but will be added in future releases.
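One common cause of this message is a NumPy array that was reversed with `[::-1]` (for example, when sorting sequence lengths in descending order) and then handed to PyTorch. Whether that is what happens in `seq2seq_atten.py` is an assumption, not something confirmed by the repository. The sketch below uses made-up lengths and only NumPy to show the negative stride and one way to remove it by copying into a contiguous array before the lengths reach `pack_padded_sequence`.

```python
import numpy as np

# Hypothetical per-video frame counts (illustrative, not from the repository).
lengths = np.array([3, 5, 2])

# Reversing with [::-1] returns a view whose stride is negative.
order = np.argsort(lengths)[::-1]
print(order.strides[0] < 0)  # True

# Copying into a contiguous array gives a positive stride again.
order_fixed = np.ascontiguousarray(order)
print(order_fixed.strides[0] > 0)  # True

# A plain Python list of lengths, sorted in descending order.
flengths = lengths[order_fixed].tolist()
print(flengths)  # [5, 3, 2]
```

If the lengths in the real code come from a reversed array like this, calling `.copy()` or `.tolist()` on them before packing may be enough.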

Hello, thank you very much for your contribution. I have been trying to reproduce your results on the MSVD dataset for a long time without success. What code did you use to extract the features, and how did you merge the two features into 4086 dimensions in the end?
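The merge step asked about here is not documented in this thread. A common approach, shown below purely as an assumption, is to concatenate the per-frame appearance features and the motion features along the feature axis. The function name `concat_features` and the toy dimensions are hypothetical, and the sketch assumes both feature sets have already been aligned to the same number of time steps.

```python
import numpy as np

def concat_features(frame_feats, motion_feats):
    """Concatenate two (time, dim) feature arrays along the feature axis.

    Assumes both arrays cover the same number of time steps; how the
    original features were aligned is not stated in this thread.
    """
    if frame_feats.shape[0] != motion_feats.shape[0]:
        raise ValueError("feature arrays must have the same number of time steps")
    return np.concatenate([frame_feats, motion_feats], axis=1)

# Toy shapes for illustration only (not the real feature sizes).
frame_feats = np.zeros((4, 6), dtype=np.float32)
motion_feats = np.ones((4, 3), dtype=np.float32)

combined = concat_features(frame_feats, motion_feats)
print(combined.shape)  # (4, 9)
```

The resulting feature size is simply the sum of the two input sizes, so the merged dimension depends on which layers the features were taken from.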

Questions about the pretrained baseline model 'msrvtt_model_base.pth'

Hello,

When I run the pretrained baseline-XE model using 'python main.py --mode test --load_path "path_to_model_ending_with_*.pth" --beam_size 5', the output contains lots of [UNK] tokens and the scores are very low.

However, the results from the pretrained 'CIDEr-RL' and 'CIDEnt-RL' models are good.

Is there any difference in the command between baseline-XE and CIDEr-RL? Or is there something wrong with the baseline-XE model?

Motion features for MSR-VTT videos

I cannot download the ResNet-152 frame-level features + ResNeXt-101 motion features for MSR-VTT videos from the Google website. Can you share them with me some other way?

Thanks.

Motion features

Excuse me, how do you extract the ResNeXt-101 motion features?

MSVD part

Thank you for your excellent work. Can you provide the code to train and test on MSVD?

No module named 'automatic_evaluation'

Hi. May I ask whether you forgot to upload the file named automatic_evaluation? Thanks.

Request for the entailment classifier model

Hello, I cloned your code and found at runtime that the entailment classifier model is missing. I looked for a suitable entailment classifier model online but could not find one. Could you share your entailment classifier model with me? Thank you very much.

Data preprocessing code

Thank you for your fantastic work. I wonder whether you could provide the data processing code.

Cannot reproduce the results in your paper with InceptionV4

Hi, I tried to use InceptionV4 as the CNN encoder to train the Baseline-XE model, but I cannot match the results in your paper. When training finished, I had three .pth files, i.e., epoch1, epoch2, and epoch3. I didn't know which was best, so I tested each of them. I found that epoch3.pth is the best of the three, but its results are still lower than those in your paper.
I got the results as follows:
2020-04-22 18:00:49,045:INFO::Results:
2020-04-22 18:00:49,046:INFO::BLEU-4:0.3745473407866488
2020-04-22 18:00:49,046:INFO::METEOR:0.26116567752846037
2020-04-22 18:00:49,046:INFO::ROUGE_L:0.5878605132246427
2020-04-22 18:00:49,046:INFO::CIDEr:0.4295072766573666
2020-04-22 18:00:49,046:INFO::AVG:0.4132702020492796,
which are much lower than those in your paper: B4 38.6, M 27.7, R-L 59.5, and C 44.6.

I trained the model with Python 3.6.10, PyTorch 0.3.1, and a Titan Xp GPU.
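For reference, the gap between the logged scores and the paper numbers quoted in this post can be worked out directly. The snippet below is only arithmetic on the values given above; the dictionary names are illustrative, and the logged scores are scaled by 100 to match the paper's scale.

```python
# Scores copied from the log lines in this post.
reported = {
    "BLEU-4": 0.3745473407866488,
    "METEOR": 0.26116567752846037,
    "ROUGE_L": 0.5878605132246427,
    "CIDEr": 0.4295072766573666,
}

# Paper numbers as quoted in this post (percentage scale).
paper = {"BLEU-4": 38.6, "METEOR": 27.7, "ROUGE_L": 59.5, "CIDEr": 44.6}

# Gap in points: paper score minus reproduced score.
gaps = {name: paper[name] - 100 * reported[name] for name in reported}
for name, gap in gaps.items():
    print(f"{name}: {100 * reported[name]:.2f} vs {paper[name]:.1f} (gap {gap:.2f})")

# The logged AVG is the plain mean of the four metrics.
avg = sum(reported.values()) / len(reported)
print(f"AVG: {avg:.4f}")
```

Every metric is below the quoted paper value, with gaps ranging from under one point (ROUGE_L) to a little over one and a half points (METEOR, CIDEr).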