
practical-rife's Introduction

Hi there 👋

  • I used to be an algorithm contest player (NOI 🥈, ICPC regional 🏅️).

  • I worked at MEGVII Research from 2017 to 2023. Currently I work at StepFun. I received my B.S. degree from Peking University in 2020.

Main Projects:

Cooperation Projects:

Google Scholar, 知乎 (Zhihu), algorithm blog, Email, CV

Service: CVPR22-24/ECCV22-24/ICCV23/AAAI23/NeurIPS23/ICLR24/ICML24/WACV24/TIP/TPAMI/TOMM

practical-rife's People

Contributors

hooke007, hzwer, thtskaran


practical-rife's Issues

Changing from v3 to v3.8 causes blurriness

V3.0 and V3.1 use three networks (flow, unet, context).
Since V3.5 the unet and context nets have been removed, and the result looks visually blurry.

Heavy distortion after cuts

I'm using model 4.1, on this video, downloaded in what youtube-dl/yt-dlp identifies as format number 248 (the default if you exclude resolutions above 1080p).

There is a "cut" between scenes/shots at 15 frames past the 11-second mark in the input video. In the output video (with default settings), this turns into a downright disorienting transition. Looking frame by frame, the first two frames after the cut are heavily distorted.

Here are those two frames, plus the next two for context:
[screenshot: adele_distortion_50fps]

The culprit is the if block starting at line 227:

if ssim > 0.996:
    frame = read_buffer.get() # read a new frame
    if frame is None:
        break_flag = True
        frame = lastframe
    else:
        temp = frame
    I1 = torch.from_numpy(np.transpose(frame, (2,0,1))).to(device, non_blocking=True).unsqueeze(0).float() / 255.
    I1 = pad_image(I1)
    I1 = model.inference(I0, I1, args.scale)
    I1_small = F.interpolate(I1, (32, 32), mode='bilinear', align_corners=False)
    ssim = ssim_matlab(I0_small[:, :3], I1_small[:, :3])
    frame = (I1[0] * 255).byte().cpu().numpy().transpose(1, 2, 0)[:h, :w]

If I change the initial threshold from 0.996 to 1 (which I think disables the block because ssim can never be >1), the issue disappears:
[screenshot: adele_distortion_50fps_fixed]
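A hedged middle ground is also conceivable (a sketch of the idea rather than a drop-in patch; the 0.2 lower bound is an assumption borrowed from the scene-change threshold used elsewhere in the script): keep the duplicate-frame read-ahead, but only accept the synthesized frame when the new pair still looks like the same shot.

if ssim > 0.996:
    frame = read_buffer.get()  # read one frame ahead
    if frame is None:
        break_flag = True
        frame = lastframe
    else:
        temp = frame
    I1 = torch.from_numpy(np.transpose(frame, (2, 0, 1))).to(device, non_blocking=True).unsqueeze(0).float() / 255.
    I1 = pad_image(I1)
    mid = model.inference(I0, I1, args.scale)  # candidate stand-in between I0 and the new frame
    mid_small = F.interpolate(mid, (32, 32), mode='bilinear', align_corners=False)
    new_ssim = ssim_matlab(I0_small[:, :3], mid_small[:, :3])
    if new_ssim > 0.2:  # the pair still looks like the same shot: accept the synthesized frame
        I1 = mid
        ssim = new_ssim
        frame = (I1[0] * 255).byte().cpu().numpy().transpose(1, 2, 0)[:h, :w]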

v2.3 missing?

The link to download older models seems to be dead. Does anyone know where to find version 2.3 now?

Artefacts described in this ticket are improved in SVP with Rife v4.15 lite running at 72fps or higher

When using SVP, Rife tends to have fewer artefacts than the other mode. I notice that Rife 4.7 has even fewer artefacts than 4.6, but the problem is that the artefacts that remain tend to be worse. The biggest one I've seen recently is a character appearing and disappearing every other frame. Is there any use in me pointing out the location of these artefacts? Is there anyone I should send them to?

Can ONNX export be supported?

Can I export the model to ONNX and use ONNX for inference? Is there a code example?
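For what it's worth, here is a hedged sketch of one way such an export could be attempted (not an official export path; the flownet call signature, the fixed timestep, and the returned merged list are assumptions that differ between 4.x versions, and data-dependent control flow such as the scale list may need to be hard-coded first):

import torch
import torch.nn as nn

class IFNetWrapper(nn.Module):
    """Hypothetical wrapper that pins the timestep so the graph is traceable."""
    def __init__(self, flownet):
        super().__init__()
        self.flownet = flownet

    def forward(self, img0, img1):
        imgs = torch.cat((img0, img1), 1)
        timestep = torch.full_like(img0[:, :1], 0.5)       # fixed mid-point frame
        flow, mask, merged = self.flownet(imgs, timestep)  # signature varies by version
        return merged[-1]                                  # final interpolated frame

wrapper = IFNetWrapper(model.flownet).eval()
dummy0 = torch.randn(1, 3, 256, 448)
dummy1 = torch.randn(1, 3, 256, 448)
torch.onnx.export(
    wrapper, (dummy0, dummy1), "rife_v4.onnx",
    input_names=["img0", "img1"], output_names=["output"],
    opset_version=16,
    dynamic_axes={"img0": {2: "h", 3: "w"}, "img1": {2: "h", 3: "w"},
                  "output": {2: "h", 3: "w"}},
)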

Training question

Hello author, are there any changes to the training strategy in the latest 4.13 version? What losses is it trained with now?

(Fixed in Rife 4.12) Rife 4.11 minor artefact regression

Hi
Thank you for Rife 4.11. I am always happy a new model version is released.

I found a slight regression for artefacts in one of my test files. There are a few more minor artefacts than in 4.9, very similar to Rife 4.6. But besides that it looks like it will behave similarly to Rife 4.10 for GPU usage.

Question about loss_cons in model.update

Hello! While reading the v4.13 and v4.14 code, I found that the update function of the Model class in RIFEv3_HD.py contains a loss_cons term, but I could not find any code that computes or uses this loss. (1) What does this loss mean? (2) If I want to fine-tune the model from the pretrained weights, do I need to pay attention to this loss? Thank you for your answer!

ONNX-model

Thanks for your excellent work!
May I ask whether an ONNX model is available?

UI distortion problem when interpolating game footage

Hello, when testing RIFE v4.13 we found that, compared with v3.6, the problem of UI elements being distorted as they are dragged along by the moving background in game frame interpolation has improved a lot. Do you have any suggestions for further reducing UI distortion and drift? Thanks!

Do you still use block_tea during training? (not an issue, just a question)

Thank you for this awesome application. This is just a question (not an issue at all): do you still use block_tea in the training phase, as we can see in the IFNet of your original RIFE repository? If so, did you train 'block_tea' outside of this code? According to your paper, I thought you were using something like a guided mask, so I assumed 'block_tea' was used during training, prior to training the main 'flownet' model. Is this correct?

Fine tuning

Hi, thank you for making your code available. The latest model release (v4.6) works really well for my project.
I would like to fine-tune the model on my dataset.
Do you plan to make the fine-tuning script available any time soon?
Thank you.

Torchvision is required

Upon running it for the first time, I got

Please download our model from model list
Traceback (most recent call last):
  File "[snip]/Practical-RIFE/inference_video.py", line 96, in <module>
    model = Model()
NameError: name 'Model' is not defined

I had installed a model. When I ran the line in IDLE:

>>> from train_log.RIFE_HDv3 import Model
    Traceback (most recent call last):
      File "/usr/lib64/python3.10/idlelib/run.py", line 578, in runcode
        exec(code, self.locals)
      File "<pyshell#2>", line 1, in <module>
      File "[snip]/repos/Practical-RIFE/train_log/RIFE_HDv3.py", line 11, in <module>
        from model.loss import *
      File "[snip]/repos/Practical-RIFE/model/loss.py", line 5, in <module>
        import torchvision.models as models
    ModuleNotFoundError: No module named 'torchvision'
>>>

I ran pip install torchvision and tried again, and it worked.
I'm guessing this means that torchvision should be added to requirements.txt.

Model v3 update log

Starting from the v3 model, the .py files will be packaged together in a compressed package for developers to change the code.

Please do not reply under this post

Training details of RIFE v4.7

Thank you for sharing this admirable work.

Could you give more details about training the network? Are only VGG and L1 losses used on each block, as in v4.6, or are additional strategies used?

Looking forward to your reply, thank you!

RuntimeError: Argument #8: Padding size should be less than the corresponding input dimension, but got: padding (5, 5) at dimension 2 of input 5

Hey friends!

Ran into a really frustrating issue that I can't figure out for the life of me. I've uninstalled and reinstalled dependencies and tried multiple videos and multiple models. I'm running the code on an M2 Mac, which was working flawlessly until I closed and reopened VSCode. I'm very frustrated with myself for not being able to figure it out. Please help. Thank you in advance.

File ~/Desktop/AI/Practical-RIFE/interpolator_helper.py:244, in interpolate_inference_video(args)
242 I0_small = F.interpolate(I0, (32, 32), mode='bilinear', align_corners=False)
243 I1_small = F.interpolate(I1, (32, 32), mode='bilinear', align_corners=False)
--> 244 ssim = ssim_matlab(I0_small[:, :3], I1_small[:, :3])
246 break_flag = False
247 if ssim > 0.996:

File ~/Desktop/AI/Practical-RIFE/model/pytorch_msssim/__init__.py:107, in ssim_matlab(img1, img2, window_size, window, size_average, full, val_range)
104 img1 = img1.unsqueeze(1)
105 img2 = img2.unsqueeze(1)
--> 107 mu1 = F.conv3d(F.pad(img1, (5, 5, 5, 5, 5, 5), mode='replicate'), window, padding=padd, groups=1)
108 mu2 = F.conv3d(F.pad(img2, (5, 5, 5, 5, 5, 5), mode='replicate'), window, padding=padd, groups=1)
110 mu1_sq = mu1.pow(2)

RuntimeError: Argument #8: Padding size should be less than the corresponding input dimension, but got: padding (5, 5) at dimension 2 of input 5

Could the v2.3 network and model links also be put in this repository?

After testing and comparing, I found that the v2.3 model produces better visual results than v4.6 in some scenes, but my tests were based on the ncnn-vulkan C++ code. I wanted the Python code but found it is not provided in this repository, so could you also put the v2.3 network and model download links here? Thanks!

RIFE 4.0 causes tensor size mismatch errors for some resolutions

Interpolating 720p or 1440p video with RIFE 4.0 throws an error after some frames:

File "D:\Code\GitHub\flowframes\Code\bin\x64\Release\FlowframesData\pkgs\rife-cuda\arch\RIFE_HDv3.py", line 59, in inference
flow, mask, merged = self.flownet(imgs, timestep, scale_list)
File "D:\Software\Python38\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\Code\GitHub\flowframes\Code\bin\x64\Release\FlowframesData\pkgs\rife-cuda\arch\IFNet_HDv3.py", line 99, in forward
f0, m0 = block[i](torch.cat((warped_img0[:, :3], warped_img1[:, :3], timestep, mask), 1), flow, scale=scale_list[i])
RuntimeError: Sizes of tensors must match except in dimension 2. Got 768 and 736 (The offending index is 2)

1080p works fine; it seems to be some scaling/cropping issue?

I can reliably reproduce this problem on 4.0, but not on 3.9, so it seems to be caused by IFNet_HDv3.py changes.
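One hedged workaround, sketched under the assumption that the mismatch comes from the inputs being padded to a multiple of 32 while the 4.0 scale pyramid needs a multiple of 64 (which matches the 736 vs 768 sizes in the error): pad height and width up to the larger multiple before inference and crop the result back afterwards.

import torch.nn.functional as F

def pad_image(img, multiple=64):
    # img: (1, 3, H, W); pad right/bottom so both H and W are divisible by `multiple`
    _, _, h, w = img.shape
    ph = ((h - 1) // multiple + 1) * multiple   # 720 -> 768
    pw = ((w - 1) // multiple + 1) * multiple
    return F.pad(img, (0, pw - w, 0, ph - h))

# after inference, crop the output back to the original size:
# out = out[:, :, :h, :w]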

[BUG] Acceleration instead of interpolation

When testing version 4, I found a bug where the video simply speeds up instead of being interpolated. That is, it does improve, but at some moments it just accelerates.

Model is not defined

File "/Users/x/Practical-RIFE/inference_video.py", line 96, in
model = Model()
NameError: name 'Model' is not defined

I downloaded the model and added it to a train_log folder. How do I define the model?

Issue with new models

I'm not sure if this is the correct place to raise this issue, and to be honest I'm pretty sure a lot of people have raised it.
But why are the results of the new models still consistently worse than 2.3?
It seems that quality peaked around that point: 3.1, 3.8, and 3.9 all performed worse than 2.3 quality-wise, although they are much better speed-wise.
Is it an issue with training or a change in code?

Add 10-bit video support

10-bit video content is becoming more and more common with a lot of modern TVs, displays, and cameras (mobile phones or dedicated) being capable of recording or playing 10 bit video files.
Popular standards like HDR10 have 10 bit video as a requirement.

Currently, RIFE doesn't support source frames with more than 8 bit pixel depth per component.
It would be very useful if RIFE could read and write 10-bit frames without losing color information during the interpolation.
Having 12/14/16-bit support for the future may also be useful, but supporting at least 10-bit content would be a big improvement, since such content is very common already.

rgb48le is one of the suitable pixel formats for the internal processing to support up to 16 bits (x3 components).
But when Practical-RIFE receives such rgb48le frames instead of rgb24 frames from the ffmpeg-based reader, the pixel data gets truncated and corrupted back to the 8-bit format while going through the inference logic.

Would it be possible to make the inference logic compatible with 10+ bit content?
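For illustration, a minimal sketch of what higher-bit-depth handling could look like (this is not the repository's reader; the frame layout and helper names are assumptions): treat rgb48le bytes as uint16 and normalize by 65535 rather than 255, and convert back the same way on output.

import numpy as np
import torch

def rgb48le_to_tensor(raw_bytes, h, w, device):
    # rgb48le: three 16-bit little-endian components per pixel
    frame = np.frombuffer(raw_bytes, dtype=np.uint16).reshape(h, w, 3)
    arr = frame.astype(np.float32) / 65535.0          # keep full precision in float
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0).to(device)

def tensor_to_rgb48le(t, h, w):
    # crop away padding and convert back to 16-bit bytes for the ffmpeg writer
    arr = (t[0].clamp(0, 1) * 65535.0).round().cpu().numpy().astype(np.uint16)
    return arr.transpose(1, 2, 0)[:h, :w].tobytes()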

Thank you for your work on RIFE!

PNG export sudden slow down

I encountered a big slowdown when exporting PNGs using inference_video.py.
The export starts fast (30 it/s) but after a few hundred frames it slows down to 6 it/s. No GPU, CPU, disk, or memory bottleneck is observed. Adding [cv2.IMWRITE_PNG_COMPRESSION, 0], i.e. changing cv2.imwrite('vid_out/{:0>7d}.png'.format(cnt), item[:, :, ::-1]) to cv2.imwrite('vid_out/{:0>7d}.png'.format(cnt), item[:, :, ::-1], [cv2.IMWRITE_PNG_COMPRESSION, 0]), barely made a difference.
I'm not experienced with python and opencv, my goal is to export lossless content.

Heavy distortion with static object among dynamic background

The clip from Your Name. shows that the vertical bars are distorted by the horizontally moving train.
This is witnessed on v4.6 and v4.4. Using Flowframes with v1.8 and v3.1, only some frames completely lose the bars, and there are just minor distortions.

[screenshot: o mp4_snapshot_00 00 875]

This effect can also be seen with a textured horizontal line against vertical movement.

[attachments: horizon.mp4, text_bg]

numpy

Hello! When I run the command "python inference_video.py --multi=2 --video=video.mp4" it fails.
[screenshot: 2023-07-29 15:30:57]
How can I solve this?
Thank you!

Support copying subtitles

I think a great and easy change would be to support copying existing subtitles from the original video file to the resulting interpolated one.

I'm pretty sure this can be done by simply adding -c:s copy to lines 31 and 40 of inference_video.py, though I didn't want to make a pull request in case you would want to add additional checks for when subtitle copying fails. For example, not all container formats support embedded subtitles, though I know mkv does, and possibly mp4?
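For illustration, a hypothetical sketch of a muxing step with subtitle copying added (the actual ffmpeg calls in inference_video.py differ in detail, and the file names and stream mapping here are assumptions): -c:s copy passes subtitle streams through unchanged, which works for mkv; mp4 output would typically need -c:s mov_text instead.

import subprocess

def mux_with_subtitles(interpolated_video, original_video, output_path):
    # take video from the interpolated file, audio and subtitles from the original
    subprocess.run([
        "ffmpeg", "-y",
        "-i", interpolated_video,
        "-i", original_video,
        "-map", "0:v", "-map", "1:a?", "-map", "1:s?",   # '?' makes the streams optional
        "-c:v", "copy", "-c:a", "copy",
        "-c:s", "copy",                                  # copy existing subtitle streams
        output_path,
    ], check=True)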

torch.cat ERROR

I'm using the 4.1 model. When I try to interpolate a video, I get this error:

2022-07-24 19:29:27,296 - one_line_shot_args - 772 - INFO - Initial New Interpolation Project: project_dir: C:/Users/Globefish/Downloads/Compressed/SVFI.3.2.Community\\u3010MAD\u3011 Genuine article \u3010\u4ffa\u30ac\u30a4\u30eb\u3011_1, INPUT_FILEPATH: C:/Users/Globefish/Desktop/\u3010MAD\u3011 Genuine article \u3010\u4ffa\u30ac\u30a4\u30eb\u3011_1.mp4
2022-07-24 19:29:27,296 - one_line_shot_args - 774 - INFO - Changing working dir to C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package
2022-07-24 19:29:27,296 - one_line_shot_args - 781 - WARNING - Not find selected ffmpeg, use default
2022-07-24 19:29:27,376 - one_line_shot_args - 255 - INFO - 
Input Video Info
{'index': 0, 'width': 960, 'height': 540, 'color_range': 'tv', 'color_space': 'smpte170m', 'color_transfer': 'smpte170m', 'color_primaries': 'smpte170m', 'r_frame_rate': '30/1', 'duration': '230.500000', 'nb_frames': '6915', 'disposition': {'default': 1, 'dub': 0, 'original': 0, 'comment': 0, 'lyrics': 0, 'karaoke': 0, 'forced': 0, 'hearing_impaired': 0, 'visual_impaired': 0, 'clean_effects': 0, 'attached_pic': 0, 'timed_thumbnails': 0}, 'tags': {'creation_time': '2022-07-24T04:25:41.000000Z', 'language': 'eng', 'handler_name': '\x1fMainconcept Video Media Handler', 'encoder': 'AVC Coding'}}
2022-07-24 19:29:27,376 - one_line_shot_args - 273 - INFO - Auto Find FPS in r_frame_rate: 30.0
2022-07-24 19:29:27,376 - one_line_shot_args - 280 - INFO - Auto Find frames cnt in nb_frames: 6915
2022-07-24 19:29:27,390 - one_line_shot_args - 832 - INFO - Check Interpolation Source, FPS: 30.0, TARGET FPS: 60.0, FRAMES_CNT: 13830, EXP: 1
2022-07-24 19:29:27,407 - one_line_shot_args - 859 - INFO - Buffer Size to 677
2022-07-24 19:29:29,077 - one_line_shot_args - 1412 - INFO - Start VRAM Test: 960x540 with scale 1.0
2022-07-24 19:29:30,384 - one_line_shot_args - 1420 - ERROR - VRAM Check Failed, PLS Lower your presets
Traceback (most recent call last):
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\one_line_shot_args.py", line 1416, in nvidia_vram_test
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\Utils\inference.py", line 174, in generate_interp
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\Utils\inference.py", line 87, in __make_inference
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\model\RIFE_HDv3.py", line 60, in inference
flow, mask, merged = self.flownet(imgs, timestep, scale_list)
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\model\IFNet_HDv3.py", line 99, in forward
f0, m0 = block[i](torch.cat((warped_img0[:, :3], warped_img1[:, :3], timestep, mask), 1), flow, scale=scale_list[i])
RuntimeError: Sizes of tensors must match except in dimension 3. Got 576 and 544 (The offending index is 2)

Traceback (most recent call last):
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\one_line_shot_args.py", line 2151, in <module>
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\one_line_shot_args.py", line 1871, in run
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\one_line_shot_args.py", line 1416, in nvidia_vram_test
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\Utils\inference.py", line 174, in generate_interp
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\Utils\inference.py", line 87, in __make_inference
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\model\RIFE_HDv3.py", line 60, in inference
flow, mask, merged = self.flownet(imgs, timestep, scale_list)
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\GLOBEF~1\DOWNLO~1\COMPRE~1\SVFI32~1.COM\Package\model\IFNet_HDv3.py", line 99, in forward
f0, m0 = block[i](torch.cat((warped_img0[:, :3], warped_img1[:, :3], timestep, mask), 1), flow, scale=scale_list[i])
RuntimeError: Sizes of tensors must match except in dimension 3. Got 576 and 544 (The offending index is 2)
INFO - ONE LINE SHOT ARGS 6.6.1 2021/6/26
Warning: Find Empty Args at 'ffmpeg_customized'
Warning: Find Empty Args at 'slow_motion_fps'
Warning: Find Empty Args at 'resize'
Warning: Find Empty Args at 'crop'
Warning: Find Empty Args at 'use_sr_model'
INFO - FP16 mode switch success
INFO - Loaded v3.x HD model.

If I use a "scale_factor" of 0.5 (even though the original video is only 960x540), it works well.
I'm using an RTX 3060 Laptop GPU with 6 GB of VRAM, which I guess should be enough. The same video can be used as input for the 3.9 model with a 1.0 scale factor.

I wonder how to solve this issue, thanks.

Rife v4.14 lite

I notice that in terms of GPU usage there is little difference between v4.14 and v4.14 lite. @WolframRhodium explained that this is because new "grouped convolution" methods are being used. It's generating a whole lot of errors like those below.

[01/16/2024-14:50:26] [I] Skipped setting output types for some layers. Check verbose logs for more details.
[01/16/2024-14:50:26] [W] [TRT] Could not read timing cache from: C:/Program Files (x86)/SVP 4/rife\models\rife\rife_v4.14_lite.onnx.1920x1088_fp16_no-tf32_workspace8192_trt-9200_cudnn_I-fp16_O-fp16_NVIDIA-GeForce-RTX-4080_8ce99e37.engine.cache. A new timing cache will be generated and written.
[01/16/2024-14:50:26] [I] [TRT] Global timing cache in use. Profiling results in this builder pass will be stored.
[01/16/2024-14:50:44] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.1/conv/Conv + block0.convblock.1.beta + /block0/convblock/convblock.1/Mul + /block0/convblock/convblock.1/Add + PWN(/block0/convblock/convblock.1/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:45] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.2/conv/Conv + block0.convblock.2.beta + /block0/convblock/convblock.2/Mul + /block0/convblock/convblock.2/Add + PWN(/block0/convblock/convblock.2/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:45] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.3/conv/Conv + block0.convblock.3.beta + /block0/convblock/convblock.3/Mul + /block0/convblock/convblock.3/Add + PWN(/block0/convblock/convblock.3/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:45] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.4/conv/Conv + block0.convblock.4.beta + /block0/convblock/convblock.4/Mul + /block0/convblock/convblock.4/Add + PWN(/block0/convblock/convblock.4/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:46] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.5/conv/Conv + block0.convblock.5.beta + /block0/convblock/convblock.5/Mul + /block0/convblock/convblock.5/Add + PWN(/block0/convblock/convblock.5/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:46] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.6/conv/Conv + block0.convblock.6.beta + /block0/convblock/convblock.6/Mul + /block0/convblock/convblock.6/Add + PWN(/block0/convblock/convblock.6/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:47] [W] [TRT] Cache result detected as invalid for node: /block0/convblock/convblock.7/conv/Conv + block0.convblock.7.beta + /block0/convblock/convblock.7/Mul + /block0/convblock/convblock.7/Add + PWN(/block0/convblock/convblock.7/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aaf
[01/16/2024-14:50:49] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.1/conv/Conv + block1.convblock.1.beta + /block1/convblock/convblock.1/Mul + /block1/convblock/convblock.1/Add + PWN(/block1/convblock/convblock.1/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.2/conv/Conv + block1.convblock.2.beta + /block1/convblock/convblock.2/Mul + /block1/convblock/convblock.2/Add + PWN(/block1/convblock/convblock.2/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.3/conv/Conv + block1.convblock.3.beta + /block1/convblock/convblock.3/Mul + /block1/convblock/convblock.3/Add + PWN(/block1/convblock/convblock.3/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.4/conv/Conv + block1.convblock.4.beta + /block1/convblock/convblock.4/Mul + /block1/convblock/convblock.4/Add + PWN(/block1/convblock/convblock.4/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.5/conv/Conv + block1.convblock.5.beta + /block1/convblock/convblock.5/Mul + /block1/convblock/convblock.5/Add + PWN(/block1/convblock/convblock.5/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.6/conv/Conv + block1.convblock.6.beta + /block1/convblock/convblock.6/Mul + /block1/convblock/convblock.6/Add + PWN(/block1/convblock/convblock.6/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:50] [W] [TRT] Cache result detected as invalid for node: /block1/convblock/convblock.7/conv/Conv + block1.convblock.7.beta + /block1/convblock/convblock.7/Mul + /block1/convblock/convblock.7/Add + PWN(/block1/convblock/convblock.7/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xad01c782980ed345
[01/16/2024-14:50:53] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.1/conv/Conv + block2.convblock.1.beta + /block2/convblock/convblock.1/Mul + /block2/convblock/convblock.1/Add + PWN(/block2/convblock/convblock.1/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:53] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.2/conv/Conv + block2.convblock.2.beta + /block2/convblock/convblock.2/Mul + /block2/convblock/convblock.2/Add + PWN(/block2/convblock/convblock.2/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:54] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.3/conv/Conv + block2.convblock.3.beta + /block2/convblock/convblock.3/Mul + /block2/convblock/convblock.3/Add + PWN(/block2/convblock/convblock.3/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:54] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.4/conv/Conv + block2.convblock.4.beta + /block2/convblock/convblock.4/Mul + /block2/convblock/convblock.4/Add + PWN(/block2/convblock/convblock.4/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:55] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.5/conv/Conv + block2.convblock.5.beta + /block2/convblock/convblock.5/Mul + /block2/convblock/convblock.5/Add + PWN(/block2/convblock/convblock.5/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:55] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.6/conv/Conv + block2.convblock.6.beta + /block2/convblock/convblock.6/Mul + /block2/convblock/convblock.6/Add + PWN(/block2/convblock/convblock.6/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:55] [W] [TRT] Cache result detected as invalid for node: /block2/convblock/convblock.7/conv/Conv + block2.convblock.7.beta + /block2/convblock/convblock.7/Mul + /block2/convblock/convblock.7/Add + PWN(/block2/convblock/convblock.7/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xcd1874fa76e36ecf
[01/16/2024-14:50:58] [W] [TRT] Cache result detected as invalid for node: /block3/convblock/convblock.1/conv/Conv + block3.convblock.1.beta + /block3/convblock/convblock.1/Mul + /block3/convblock/convblock.1/Add + PWN(/block3/convblock/convblock.1/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aae
[01/16/2024-14:50:58] [W] [TRT] Cache result detected as invalid for node: /block3/convblock/convblock.2/conv/Conv + block3.convblock.2.beta + /block3/convblock/convblock.2/Mul + /block3/convblock/convblock.2/Add + PWN(/block3/convblock/convblock.2/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aae
[01/16/2024-14:50:59] [W] [TRT] Cache result detected as invalid for node: /block3/convblock/convblock.3/conv/Conv + block3.convblock.3.beta + /block3/convblock/convblock.3/Mul + /block3/convblock/convblock.3/Add + PWN(/block3/convblock/convblock.3/relu/LeakyRelu), LayerImpl: CaskConvolution, tactic: 0xecff17b04e8a0aae

[Google Colab error] ERROR: No matching distribution found for torchvision==0.7.0

Hello!
I tried to run this on Google Colab using Colab_demo.ipynb as a reference and encountered the following error.

ERROR: Could not find a version that satisfies the requirement torchvision==0.7.0 (from versions: 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.2.0, 0.2.1, 0.2.2, 0.2.2.post2, 0.2.2.post3, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.2, 0.16.0)
ERROR: No matching distribution found for torchvision==0.7.0

As a result, I was able to start using it successfully by deleting the line torchvision==0.7.0 from requirements.txt.

For example, I was thinking that if someone like me tries to use Google Colab in the future, a requirements_google_colab.txt could be prepared so they can try it without running into problems. What do you think?
(Maybe more requirements.txt to manage is not a good thing...)

Anime model request

I've always seen plenty of artefacts when interpolating anime videos using RIFE. Could you please make a model that's specially trained on anime to create the best results for this content? RIFE is commonly used to interpolate anime too, so I think a special anime model would satisfy lots of people!

Why was ensemble removed for models 4.7+?

Hey.
Actually, I found ensemble to be very useful for fighting pattern artifacts (fences, carpet patterns, etc.), besides using higher scales.

Can the ensemble code just be re-added to IFNet_HDv3.py or is there more to it to make ensemble work again?

Source code and paper description are inconsistent

Hi, the learning rate and weight_decay described in the source code and the paper are inconsistent: in the source code the learning rate finally decays to 3e-6 and weight_decay is 1e-3, while in the paper these are 3e-5 and 1e-4 respectively. Which combination was used in the actual model training?

Found bugs in inference_img.py

inference_img.py - line 99

with bugs:

res.append(model.inference(img0, img1, (i+1) * 1. / (n+1), args.scale))

fixed:

img_list.append(model.inference(img0, img1, (i+1) / n))
  1. There's no res variable, so I assumed it should be img_list.
  2. The formula was incorrect, resulting in a 0.333 ratio instead of 0.5 for x2 interpolation (see the sketch below).
  3. There's no args.scale argument, so I removed it, but you can add it to the argument list to keep it.
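Whether the right denominator is n or n+1 depends on what n counts in that script. As a minimal illustration (not the repository's code), assuming n is the number of intermediate frames to insert, the (n+1) denominator gives evenly spaced timesteps (0.5 for x2, 1/3 and 2/3 for x3):

# Hypothetical loop for n intermediate frames with a 4.x model that accepts a
# timestep ratio; img0, img1 and model are assumed to be prepared as in inference_img.py.
img_list = [img0]
n = 1  # x2 interpolation -> a single intermediate frame at t = 0.5
for i in range(n):
    img_list.append(model.inference(img0, img1, (i + 1) * 1.0 / (n + 1)))
img_list.append(img1)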

Dataset for training

Hi, I tried to retrain the v4.6 model with the Vimeo90K dataset. However, the trained model does not perform as well as the provided model on test images. Is 4.6 still using the Vimeo dataset, or are you using other datasets as well?

Changelog for model versions

Hello RIFE authors! I often find myself checking out this repository in anticipation of new checkpoint releases. So far, I had a bit of trouble understanding what each version is supposed to improve upon. I could not find any changelog for this, would it be possible to include one somewhere? If one already exists, could you kindly point me to it?

Thank you so much, keep up the great work!

import nori2 as nori

ModuleNotFoundError: No module named 'nori2'

I am trying to start training the model, but some modules do not exist and cannot be found. Where can I get this module?

scenes / cuts

Hi,

A fun side effect is that it also interpolates across cuts/scenes. Is there a way to avoid this? For example, a threshold so it doesn't do any interpolating when the image changes too much?
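For reference, a minimal sketch of that idea (the ssim_matlab comparison and the 0.2 threshold mirror what inference_video.py already does; make_inference and args.multi are assumptions about the surrounding script): compare downscaled frames with SSIM and duplicate the source frame instead of interpolating when the similarity is too low.

# Hedged sketch: skip interpolation across likely scene cuts.
I0_small = F.interpolate(I0, (32, 32), mode='bilinear', align_corners=False)
I1_small = F.interpolate(I1, (32, 32), mode='bilinear', align_corners=False)
ssim = ssim_matlab(I0_small[:, :3], I1_small[:, :3])

if ssim < 0.2:
    # the two frames are too different (probably a cut): repeat I0 instead of blending shots
    output = [I0 for _ in range(args.multi - 1)]
else:
    output = make_inference(I0, I1, args.multi - 1)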

What is the meaning of ensemble?

if ensemble:
    # run the block a second time with the two images swapped and the timestep mirrored
    f1, m1 = block[i](torch.cat((img1[:, :3], img0[:, :3], 1 - timestep), 1), None, scale=scale_list[i])
    # swap the flow halves of the reversed prediction back before averaging with the forward one
    flow = (flow + torch.cat((f1[:, 2:4], f1[:, :2]), 1)) / 2
    # the mask is averaged with its sign flipped, since img0 and img1 trade roles
    mask = (mask + (-m1)) / 2

Can you explain the ensemble effect? Thanks very much.

Add anime models

Is there any way to add interpolation for animation? The current version can of course make animation smoother, but with a bunch of artifacts. Yes, version 4 has become better compared to the past, but it would still be better to make a separate model for animation, as was done for example in Real-ESRGAN.

Not working in Colab or on local machine

It says something about a size mismatch and does not interpolate. In Colab there shouldn't be any errors (you just run it and it works), but it doesn't. Maybe it wasn't checked in a while and has been broken for a long time. This is the local machine error:
size mismatch for block2.conv0.0.0.weight: copying a param with shape torch.size

blah blah.
Why is this happening?
