edouardelasalles / srvp
Official implementation of the paper Stochastic Latent Residual Video Prediction
Home Page: https://sites.google.com/view/srvp/
License: Apache License 2.0
Dear authors,
First of all, thanks for providing the code for your interesting paper. Great stuff!
I am starting to reproduce some of the results (i.e., figuring out how to use everything provided).
You provided most of the architectural choices/parameters in the paper, but I wanted to check whether I am using them correctly here.
For example, from my understanding, to run the stochastic video prediction:
python train.py --dataset smmnist --data_dir $DATA_FOLDER --nc 1 --ny 20 --nz 20 --nt_cond 5 --seq_len 15 --nt_inf 5 --save_path $SAVE_FOLDER
I guess you always set nt_inf to the same value as nt_cond, the number of conditioning frames?
Maybe you could also provide some, if not all, of the command lines for running the experiments in the README.md, as that would make things much easier to figure out.
Greatly appreciated!
PS: Is there a way to monitor progress? For me it seems to run the whole training without any intermediate validation or temporary outputs. Is this intended, or user error?
I thought the chkpt_interval command-line parameter might do something, but it does not seem to trigger anything.
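For context, here is a minimal sketch of the kind of periodic checkpointing one might expect a chkpt_interval option to trigger. All names here are illustrative assumptions, not the repo's actual implementation:

```python
# Hypothetical sketch: save a checkpoint every `chkpt_interval` steps.
# Names are illustrative; this is not train.py's actual logic.
import os
import tempfile

def train(n_iterations, chkpt_interval, save_path):
    """Toy loop: 'train' and save a checkpoint every chkpt_interval steps."""
    saved = []
    for step in range(1, n_iterations + 1):
        # ... one optimization step would go here ...
        if step % chkpt_interval == 0:
            path = os.path.join(save_path, f"model_{step}.pt")
            with open(path, "w") as f:
                f.write(str(step))  # stand-in for torch.save(...)
            saved.append(path)
    return saved

save_dir = tempfile.mkdtemp()
checkpoints = train(n_iterations=10, chkpt_interval=4, save_path=save_dir)
print(len(checkpoints))  # checkpoints at steps 4 and 8
```

If the repo's flag behaves like this, a value larger than the total number of iterations would explain nothing ever being saved.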
@White-Link Hi, this is great work! I am new to this field, so the question may be a bit naive: how should I prepare the training data to predict in-the-wild videos? For example, if I would like to predict any video of my choosing, what should I do? Thanks a lot, Hu.
Hi,
I was trying to reproduce the results from the paper on the Stochastic Moving MNIST dataset. Here are the results I got when running the code with the instructions in the README file and your pretrained model:
Results:
psnr 16.066757 +/- 0.06360093271710796
ssim 0.75027704 +/- 0.0019019813119048063
lpips 0.13383576 +/- 0.0010925805679736199
However, the results do not match the paper. I assume "Ours" is the shared model. The paper reports for PSNR and SSIM:
Ours: 16.93 ± 0.07 (PSNR), 0.7799 ± 0.0020 (SSIM)
Am I doing something wrong with the code?
Thanks a lot.
Edit: I forgot to mention that I created the test set with the command you shared in the README.
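When comparing such numbers, it helps to know how the ± half-widths are computed. A common convention is a ~95% confidence interval, 1.96·std/√n over per-sequence scores; whether the repo's test script uses exactly this convention is an assumption. A minimal sketch:

```python
# Sketch of one common "mean +/- half-width" convention (95% CI).
# Whether test.py uses exactly this is an assumption.
import math

def mean_and_ci95(values):
    """Mean and ~95% confidence half-width (1.96 * std / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    half = 1.96 * math.sqrt(var / n)
    return mean, half

psnr_per_seq = [16.0, 16.2, 15.9, 16.1]  # toy per-sequence scores
m, h = mean_and_ci95(psnr_per_seq)
print(f"psnr {m:.4f} +/- {h:.4f}")
```

With intervals this tight, even a small mismatch in the test set or sampling protocol can make the reported and reproduced ranges fail to overlap.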
I trained the model easily by following your instructions, but I got "OSError: [Errno 12] Cannot allocate memory" at around iteration 319999/1100000. I tried setting n_workers=0 and pin_memory=False, but it didn't help. How much CPU memory is needed to train the model on SM-MNIST? (My machine has 80 GB.)
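One general mitigation for this class of error, independent of the repo's actual loader, is to build batches lazily instead of materializing the whole dataset up front. A toy sketch of the pattern (not the repo's code):

```python
# Sketch of streaming batches lazily so the whole dataset never sits in
# memory at once; one generic mitigation for "Cannot allocate memory".
# This is illustrative, not the repo's actual data pipeline.
def batches(generate_item, n_items, batch_size):
    """Yield batches built on demand from a per-item generator function."""
    batch = []
    for i in range(n_items):
        batch.append(generate_item(i))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# toy item generator standing in for on-the-fly sequence synthesis
out = list(batches(lambda i: i * i, n_items=5, batch_size=2))
print(out)  # [[0, 1], [4, 9], [16]]
```

If the loader already streams like this, the leak may instead come from worker processes accumulating copies, which num_workers=0 would normally rule out.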
Hi,
I noticed you used 2 Euler steps for the KTH, Human3.6M, and BAIR datasets. May I ask why you used 2 Euler steps instead of 1?
If you have results for a model trained and tested with 1 Euler step on these datasets, could you share them? (I think you shared frame-wise metrics for the KTH dataset in Table 8.)
To be specific, it would be great if you could confirm the following:
Based on the explanations in Table 1, we assume that "Ours" means the SRVP model trained with dt=1/2 and tested with dt=1, and "Ours - dt/2" means the SRVP model trained and tested with dt=1/2. Is this correct?
If you share the trained models, we can evaluate them using your scripts to save you the trouble :)
Thanks in advance.
Hi, I'm using an Nvidia T4 GPU, and I'm wondering whether my training time is reasonable. For example, when training on Moving MNIST (deterministic), 160 iterations take approximately 2 hours. Turning on Apex did not help much.
I saw that the README uses 900000 iterations for training on Moving MNIST (deterministic), which seems impossible on my machine, so I am wondering whether my training time is actually reasonable or whether I am doing something wrong. How many GPUs did you use, and how much training time did 900000 iterations take with your GPU configuration?
Thank you!
Hi,
To avoid mixing up issues, I am opening a separate one for this. Again, thank you for providing the code.
The metrics for the BAIR dataset are reported (Table 5) as 19.59 ± 0.27 (PSNR) and 0.8196 ± 0.0084 (SSIM). However, when I run the test code with the instructions given in the README file (adding device 0 and batch size 32) and the pre-trained weights you shared, the script reports:
psnr 18.830994 +/- 0.274060565829277
ssim 0.80773425 +/- 0.008860558923333883
lpips 0.040237267 +/- 0.0024190666200593113
Am I missing something?
Thanks.
Hi,
Thank you for providing the code of the work.
In the paper, it is stated that the evaluation works by "for each test sequence, sampling from the tested model a given number (here, 100) of possible futures and reporting the best performing sample against the true video".
My understanding is that you generate a hundred samples for each test sequence and report the best SSIM value per sequence. Is this correct?
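To make my reading concrete, here is a toy sketch of that protocol: draw N samples per test sequence, keep each sequence's best score, then average over sequences. The sampler and metric here are stand-ins, not the real model or SSIM:

```python
# Toy sketch of a best-of-N evaluation protocol (our reading of the paper).
# `sample` and `score` are stand-ins, not the real model / SSIM.
import random

def best_of_n(sequences, sample, score, n=100):
    """Average over sequences of the best score among n drawn samples."""
    per_seq_best = []
    for seq in sequences:
        per_seq_best.append(max(score(sample(seq), seq) for _ in range(n)))
    return sum(per_seq_best) / len(per_seq_best)

random.seed(0)
# toy setup: 'sequences' are numbers, samples are noisy guesses,
# and the score is higher when a guess is closer to the truth
result = best_of_n(
    sequences=[1.0, 2.0],
    sample=lambda s: s + random.uniform(-1, 1),
    score=lambda pred, true: -abs(pred - true),
    n=100,
)
```

Note that taking the max over 100 samples is optimistically biased, which would be consistent with best-of-N scores coming out well above the single-sample averages.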
In the SVG evaluation, the metrics are calculated but not reported. For the BAIR dataset, the PSNR and SSIM I obtain are higher than those you report in the paper (Table 5): the table gives 18.95 ± 0.26 (PSNR) and 0.8058 ± 0.0088 (SSIM), but if we take the best-performing sample out of 100 for each sequence, the metrics jump to 22.70 (PSNR) and 0.8910 (SSIM). (Weights from the SVG repository are used.) Am I doing something wrong in this evaluation?
If you clarify these points, I would be grateful.
Thanks.
Hi,
Thanks for sharing the codes of your paper.
I was going through the test.py script, and noticed that when calculating FVD you concatenate the conditioning frames with the generated frames. Shouldn't only the generated frames be used?
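To illustrate the distinction I mean, here is a minimal sketch of dropping the conditioning prefix before scoring; the names and frame counts are illustrative, not test.py's actual variables:

```python
# Illustrative sketch: separating conditioning frames from generated ones
# before computing a video metric. Names/counts are assumptions, not test.py's.
nt_cond = 5                               # number of conditioning frames
full_sequence = list(range(15))           # conditioning + generated frames
generated_only = full_sequence[nt_cond:]  # frames the model actually predicted
print(len(generated_only))  # 10
```

Scoring the full sequence includes ground-truth frames the model was given as input, which could make the metric look better than the predictions alone warrant.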
Thanks