edouardelasalles / srvp
Official implementation of the paper Stochastic Latent Residual Video Prediction
Home Page: https://sites.google.com/view/srvp/
License: Apache License 2.0
Dear authors,
First of all, thanks for providing the code for your interesting paper. Great stuff!
I am starting to reproduce some of the results (i.e., figuring out how to use everything provided).
You provided most of the architectural choices/parameters in the paper, but I wanted to check whether I am using them correctly here.
For example, from my understanding, to run the stochastic video prediction:
python train.py --dataset smmnist --data_dir $DATA_FOLDER --nc 1 --ny 20 --nz 20 --nt_cond 5 --seq_len 15 --nt_inf 5 --save_path $SAVE_FOLDER
I guess you always set nt_inf to the same value as nt_cond, the number of conditioning frames?
Maybe you could also provide some, if not all, of the command lines for running the experiments in the README.md, as that would make things much easier to figure out.
Greatly appreciated!
PS: Is there a way to monitor progress? For me it seems to run the whole training without any intermediate validation or temporary outputs. Is this intended, or user error?
I thought the chkpt_interval command-line parameter might do something, but it does not seem to trigger anything.
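For context, here is a minimal sketch of the kind of periodic checkpointing one might expect a chkpt_interval option to trigger. All names here are illustrative assumptions, not the repo's actual implementation:

```python
# Hypothetical sketch: save a checkpoint every `chkpt_interval` steps.
# Names are illustrative; this is not train.py's actual logic.
import os
import tempfile

def train(n_iterations, chkpt_interval, save_path):
    """Toy loop: 'train' and save a checkpoint every chkpt_interval steps."""
    saved = []
    for step in range(1, n_iterations + 1):
        # ... one optimization step would go here ...
        if step % chkpt_interval == 0:
            path = os.path.join(save_path, f"model_{step}.pt")
            with open(path, "w") as f:
                f.write(str(step))  # stand-in for torch.save(...)
            saved.append(path)
    return saved

save_dir = tempfile.mkdtemp()
checkpoints = train(n_iterations=10, chkpt_interval=4, save_path=save_dir)
print(len(checkpoints))  # checkpoints at steps 4 and 8
```

If the repo's flag behaves like this, a value larger than the total number of iterations would explain nothing ever being saved.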
@White-Link Hi, this is great work! I am new to this field, so the question may be a bit naive: how should I prepare the training data to predict in-the-wild videos? For example, if I would like to predict any video of my choosing, what should I do? Thanks a lot, Hu.
Hi,
I was trying to reproduce the results from the paper on the Stochastic Moving MNIST dataset. Here are the results I got when running the code with the instructions in the README file and your pretrained model:
Results:
psnr 16.066757 +/- 0.06360093271710796
ssim 0.75027704 +/- 0.0019019813119048063
lpips 0.13383576 +/- 0.0010925805679736199
However, the results do not match the paper. I assume "Ours" is the shared model. The paper reports for PSNR and SSIM:
Ours: 16.93 ± 0.07 (PSNR), 0.7799 ± 0.0020 (SSIM)
Am I doing something wrong with the code?
Thanks a lot.
Edit: I forgot to mention that I created the test set with the command you shared in the README.
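When comparing such numbers, it helps to know how the ± half-widths are computed. A common convention is a ~95% confidence interval, 1.96·std/√n over per-sequence scores; whether the repo's test script uses exactly this convention is an assumption. A minimal sketch:

```python
# Sketch of one common "mean +/- half-width" convention (95% CI).
# Whether test.py uses exactly this is an assumption.
import math

def mean_and_ci95(values):
    """Mean and ~95% confidence half-width (1.96 * std / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    half = 1.96 * math.sqrt(var / n)
    return mean, half

psnr_per_seq = [16.0, 16.2, 15.9, 16.1]  # toy per-sequence scores
m, h = mean_and_ci95(psnr_per_seq)
print(f"psnr {m:.4f} +/- {h:.4f}")
```

With intervals this tight, even a small mismatch in the test set or sampling protocol can make the reported and reproduced ranges fail to overlap.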
I trained the model easily by following your instructions, but I got "OSError: [Errno 12] Cannot allocate memory" at around iteration 319999/1100000. I tried setting n_workers=0 and pin_memory=False, but it didn't help. How much CPU memory is needed to train the model on SM-MNIST? (My machine has 80 GB.)
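One general mitigation for this class of error, independent of the repo's actual loader, is to build batches lazily instead of materializing the whole dataset up front. A toy sketch of the pattern (not the repo's code):

```python
# Sketch of streaming batches lazily so the whole dataset never sits in
# memory at once; one generic mitigation for "Cannot allocate memory".
# This is illustrative, not the repo's actual data pipeline.
def batches(generate_item, n_items, batch_size):
    """Yield batches built on demand from a per-item generator function."""
    batch = []
    for i in range(n_items):
        batch.append(generate_item(i))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

# toy item generator standing in for on-the-fly sequence synthesis
out = list(batches(lambda i: i * i, n_items=5, batch_size=2))
print(out)  # [[0, 1], [4, 9], [16]]
```

If the loader already streams like this, the leak may instead come from worker processes accumulating copies, which num_workers=0 would normally rule out.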
Hi,
I noticed you used 2 Euler steps for the KTH, Human3.6M, and BAIR datasets. May I ask why you used 2 Euler steps instead of 1?
If you have results for a model trained and tested with 1 Euler step on these datasets, could you share them? (I think you shared frame-wise metrics for the KTH dataset in Table 8.)
To be specific, it would be great if you could confirm the following:
Based on the explanations in Table 1, we assume that "Ours" means the SRVP model trained with dt=1/2 and tested with dt=1, and "Ours - dt/2" means the SRVP model trained and tested with dt=1/2. Is this correct?
If you share the trained models, we can evaluate them using your scripts to save you the trouble :)
Thanks in advance.
Hi, I'm using an Nvidia T4 GPU, and I'm wondering whether my training time is reasonable. For example, when training on Moving MNIST (deterministic), 160 iterations take approximately 2 hours. Turning on Apex did not help much.
I saw that the README uses 900000 iterations for training on Moving MNIST (deterministic), which seems impossible on my machine, so I am wondering whether my training time is actually reasonable or whether I am doing something wrong. How many GPUs did you use, and how much training time did 900000 iterations take with your GPU configuration?
Thank you!
Hi,
To avoid mixing up issues, I am opening a separate one for this. Again, thank you for providing the code.
The metrics for the BAIR dataset are reported (Table 5) as 19.59 ± 0.27 (PSNR) and 0.8196 ± 0.0084 (SSIM). However, when I run the test code with the instructions given in the README file (adding device 0 and batch size 32) and the pre-trained weights you shared, the script reports:
psnr 18.830994 +/- 0.274060565829277
ssim 0.80773425 +/- 0.008860558923333883
lpips 0.040237267 +/- 0.0024190666200593113
Am I missing something?
Thanks.
Hi,
Thank you for providing the code of the work.
In the paper, it is stated that the evaluation works by "for each test sequence, sampling from the tested model a given number (here, 100) of possible futures and reporting the best performing sample against the true video".
My understanding is that you generate a hundred samples for each test sequence and report the best SSIM value per sequence. Is this correct?
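To make my reading concrete, here is a toy sketch of that protocol: draw N samples per test sequence, keep each sequence's best score, then average over sequences. The sampler and metric here are stand-ins, not the real model or SSIM:

```python
# Toy sketch of a best-of-N evaluation protocol (our reading of the paper).
# `sample` and `score` are stand-ins, not the real model / SSIM.
import random

def best_of_n(sequences, sample, score, n=100):
    """Average over sequences of the best score among n drawn samples."""
    per_seq_best = []
    for seq in sequences:
        per_seq_best.append(max(score(sample(seq), seq) for _ in range(n)))
    return sum(per_seq_best) / len(per_seq_best)

random.seed(0)
# toy setup: 'sequences' are numbers, samples are noisy guesses,
# and the score is higher when a guess is closer to the truth
result = best_of_n(
    sequences=[1.0, 2.0],
    sample=lambda s: s + random.uniform(-1, 1),
    score=lambda pred, true: -abs(pred - true),
    n=100,
)
```

Note that taking the max over 100 samples is optimistically biased, which would be consistent with best-of-N scores coming out well above the single-sample averages.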
In the SVG evaluation, the metrics are calculated but not reported. For the BAIR dataset, the PSNR and SSIM I obtain are higher than those you report in the paper (Table 5): the table gives 18.95 ± 0.26 (PSNR) and 0.8058 ± 0.0088 (SSIM), but if we take the best-performing sample out of 100 for each sequence, the metrics jump to 22.70 (PSNR) and 0.8910 (SSIM). (Weights from the SVG repository are used.) Am I doing something wrong in this evaluation?
If you clarify these points, I would be grateful.
Thanks.
Hi,
Thanks for sharing the codes of your paper.
I was going through the test.py script, and noticed that when calculating FVD you concatenate the conditioning frames with the generated frames. Shouldn't only the generated frames be used?
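To illustrate the distinction I mean, here is a minimal sketch of dropping the conditioning prefix before scoring; the names and frame counts are illustrative, not test.py's actual variables:

```python
# Illustrative sketch: separating conditioning frames from generated ones
# before computing a video metric. Names/counts are assumptions, not test.py's.
nt_cond = 5                               # number of conditioning frames
full_sequence = list(range(15))           # conditioning + generated frames
generated_only = full_sequence[nt_cond:]  # frames the model actually predicted
print(len(generated_only))  # 10
```

Scoring the full sequence includes ground-truth frames the model was given as input, which could make the metric look better than the predictions alone warrant.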
Thanks