Comments (16)
Even when using isv, the score is still far lower than expected.
python src/scripts/calc_metrics_for_dataset.py \
--fake_data_path datasets/UCF101/frames/trainval_32fps \
--mirror 1 --gpus 1 --resolution 256 --metrics isv2048_ucf --verbose 1 --use_cache 0
Real data options:
{'class_name': 'training.dataset.VideoFramesFolderDataset', 'path': None, 'cfg': {'max_num_frames': 10000}, 'xflip': True, 'resolution': 256, 'use_labels': False}
Fake data options:
{'class_name': 'training.dataset.VideoFramesFolderDataset', 'path': 'datasets/UCF101/frames/trainval_32fps', 'cfg': {'max_num_frames': 10000}, 'xflip': False, 'resolution': 256, 'use_labels': False}
Launching processes...
Calculating isv2048_ucf...
dataset features items 1024 time 2m 27s ms/item 143.37
dataset features items 2048 time 4m 10s ms/item 100.84
{"results": {"isv2048_ucf_mean": 16.65230369567871, "isv2048_ucf_std": 0.7203830480575562}, "metric": "isv2048_ucf", "total_time": 379.13585901260376, "total_time_str": "6m 19s", "num_gpus": 1, "snapshot_pkl": null, "timestamp": 1665998206.2202895}
from stylegan-v.
I got the same issue. I used isv and got mean: 16.668, std: 0.4938.
@anonymous202203 Hi, have you figured it out?
Not with this repository.
I adopted another implementation of FVD (https://github.com/pfnet-research/tgan2), which gives me a reasonable IS score.
@anonymous202203 do you have its pretrained model on UCF101? Their implementation is a little confusing without a pretrained model config.
@anonymous202203 can you share your parameters for tgan2? I tested it on the original UCF-101 and the IS was only around 30, while it is supposed to be 60.
Hi @anonymous202203 , ok, this is serious. Do you think it is possible to share your version of the UCF dataset with me (my email is [email protected])? Our ISV implementation should be identical to the one from TGANv2 and I was checking all the activations to verify that it's indeed true. In our case, it was giving scores of ~90 for real data as far as I remember (UPD: yeah, I just took a look at Table 5, it is 97).
Also, IS which you measured in your first comment is an image-based metric using an ImageNet-pretrained model, so it's not a surprise that it shows low values.
P.S. I apologize for not responding in time
@anonymous202203 @martinriven
Ah, I think I might understand the issue: since you pretend that you evaluate on fake data, you are using just 2048 videos out of the 10-11k available. As a result, a lot of classes are omitted, which makes IS very unhappy. You should change the num_gen argument here to be (at least approximately) equal to the number of videos in your UCF. Otherwise, many classes are not covered and IS is low.
We used just 2048 videos for fake data to be comparable with prior work. Also, if the classes are randomly (and thus evenly) distributed in those 2048 videos, then Inception Score is not that bad. But when you pass a dataset as fake data, just the first 2048 videos are taken from it because the dataloader is run with shuffle=False during evaluation.
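To illustrate the class-coverage argument, here is a minimal sketch (not the repository's code). It assumes a perfect classifier that outputs one-hot predictions, 101 classes, 13320 videos, and a hypothetical dataset listing in which videos of the same class are stored contiguously. Under those assumptions, taking the first 2048 items covers only a handful of classes, while a random subset of the same size covers nearly all of them.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0, keepdims=True)
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

def one_hot(labels, num_classes):
    """Idealized classifier output: probability 1 on the true class."""
    return np.eye(num_classes)[labels]

num_classes, num_videos, subset = 101, 13320, 2048

# Assumption: the dataset is listed class by class (same-class videos are contiguous).
labels_sorted = np.sort(np.arange(num_videos) % num_classes)
labels_shuffled = np.random.default_rng(0).permutation(labels_sorted)

first = labels_sorted[:subset]    # what shuffle=False would pick
rand = labels_shuffled[:subset]   # what a shuffled loader would pick

is_first = inception_score(one_hot(first, num_classes))
is_random = inception_score(one_hot(rand, num_classes))

print("classes covered (first 2048): ", len(np.unique(first)))
print("classes covered (random 2048):", len(np.unique(rand)))
print("IS (first 2048): ", is_first)
print("IS (random 2048):", is_random)
```

With one-hot predictions the score equals the exponential of the entropy of the label marginal, so it cannot exceed the number of classes present in the subset. Real classifier outputs are softer, so actual numbers will differ from this idealized case.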
@universome yeah, you are right. When I set it to 13320, I got almost 60, thx.
@martinriven 60 is still too low. I've just recomputed the metric with num_gen=13320 and got ISV=84.13. I suspect that there could be an issue with how you pre-process the UCF dataset.
@universome Well, according to LDVD GAN, if the resolution is set to 128, the IS should be around 80-90. Why did you get 84 with the resolution set to 256? Shouldn't it be higher?
@martinriven the underlying C3D model resizes all the input videos to the 112x112 resolution, so it would be producing almost identical results for anything higher than 112x112 (depending on the downsampling scheme you use)
@universome you are right, there should be no difference when the resolution is higher than 112. I wonder how you process the data? I chose the central 32 frames of each video (due to the training scheme), then center-cropped and resized to 128 and 256 resolution. I could only get an IS of 60.
@liangbingzhao we use the full videos during training. For the metric calculation above, since it is assumed that those videos are generated by the generator (i.e. we pass the real data via fake_data_path), they are assumed to be just 16 frames long and we simply extract the first 16 frames from each video. When I tried running the above script while extracting 16 random consecutive frames from each video, ISV was ~85. Do you store videos as JPEG images (and if so, which JPEG quality did you use while converting MP4 into JPEG)?
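The two frame-selection schemes mentioned above (the first 16 frames versus 16 random consecutive frames) can be sketched as follows. This is only an illustration: the function names `first_clip` and `random_clip` are hypothetical and are not taken from the repository.

```python
import random

def first_clip(num_frames, clip_len=16):
    """Indices of the first `clip_len` frames of a video."""
    if num_frames < clip_len:
        raise ValueError("video is shorter than the requested clip")
    return list(range(clip_len))

def random_clip(num_frames, clip_len=16, rng=None):
    """Indices of `clip_len` consecutive frames starting at a random offset."""
    if num_frames < clip_len:
        raise ValueError("video is shorter than the requested clip")
    rng = rng or random.Random()
    start = rng.randint(0, num_frames - clip_len)  # both bounds inclusive
    return list(range(start, start + clip_len))

# Example with a hypothetical 150-frame video
print(first_clip(150))
print(random_clip(150, rng=random.Random(0)))
```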
I first cropped UCF to 240x240 and stored it as MP4. Then I used your script to convert the UCF videos to JPEG. I tried storing both 128 and 256 images, and both only got an ISV of 60.
@liangbingzhao As far as I remember, we simply downloaded the original UCF videos and then preprocessed them into a collection of JPG images with our script. Can it be the case that you accidentally decreased the video quality (e.g., by using too severe a compression) while converting to MP4?