
e2fgvi's People

Contributors

lgyoung, nk-cs-zzl, paper99


e2fgvi's Issues

About Training

Hello, I have recently been reading the paper for this model and debugging the code.
I have two questions for you.

  1. Regarding the dataset, here is my directory layout. [Screenshot omitted]

When I run the command :/data/team10/cai/E2FGV$ sh datasets/zip_dir.sh
it reports a missing directory: [./datasets/davis/JPEGImages] is not exist. Please check the directory.
followed by Done!
I would like to know whether my directory layout is wrong.

  2. Second, I cannot get hold of a GPU during training (the server has 6 GPUs, and 2 were free the last time I tried). [Screenshot omitted]
config['world_size'] = get_world_size() returns 0.
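
In case it helps with debugging, here is a minimal check (my own suggestion, not code from this repo) to confirm that PyTorch can actually see the free GPUs; the world size is usually derived from the visible device count, so if this prints 0 the process simply cannot see any card. The indices in CUDA_VISIBLE_DEVICES are placeholders for the two free GPUs on your server.

import os
import torch

# Restrict this job to the two free GPUs (indices here are hypothetical).
os.environ.setdefault('CUDA_VISIBLE_DEVICES', '4,5')

print('CUDA available:', torch.cuda.is_available())
print('Visible GPU count:', torch.cuda.device_count())  # world_size is usually taken from this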

About the pretrained model of discriminator and opt.pth

Hello, I'm very lucky and happy to have come across your work. What fantastic work!
I am doing some research which also contains video inpainting. I'd like to finetune your pretrained model on my new dataset. However, I could only find the generator model in the link given in README.md. Could you please upload the discriminator model as well (also the opt.pth)? Or could you please tell me how to get access to it in case I missed the downloading link?
Thank you very much!

Environment configuration

Hi, I tried to modify the training environment to resolve the DCN v2 problem mentioned in #6, but it did not work out. I could not use your original configuration, and I haven't figured out why; it is probably due to the CUDA driver version or something else. I also tried resuming from the checkpoint saved before the loss curve went up, but that did not work either. I would like to follow your original configuration. Can you give some information about your system and the CUDA driver version? Thanks a lot.

Error reported in training <IndexError: list index out of range>

Hello! I have recently been reading your paper and trying to debug the code, and I have a question for you.
When training the e2fgvi model with the YouTube-VOS dataset, the following problem occurred. [Screenshot omitted]

The index goes out of range.
Looking at datasets/youtube-vos/train.json, I guess each entry means "video ID: frame count"; for example, "003234408d": 180 means that the YouTube-VOS video with ID 003234408d has 180 frames in total. However, the folder datasets/youtube-vos/JPEGImages contains no video with ID 003234408d, so I suspect the dataset I downloaded is wrong. But I followed the "Prepare dataset for training and evaluation" instructions, downloaded train.zip and test_all_frames.zip from YouTube-VOS 2018 (or Google Drive) and extracted them, and the masks are the ones provided in the instructions.
Did I use the wrong dataset?
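
A quick sanity check (my own sketch, assuming the layout described above: train.json maps video IDs to frame counts and JPEGImages holds one sub-directory per video) to list IDs that appear in the JSON but are missing on disk:

import json
import os

root = 'datasets/youtube-vos'                      # assumed dataset root
with open(os.path.join(root, 'train.json')) as f:
    frame_counts = json.load(f)                    # e.g. {"003234408d": 180, ...}

on_disk = set(os.listdir(os.path.join(root, 'JPEGImages')))
missing = [vid for vid in frame_counts if vid not in on_disk]
print(len(missing), 'of', len(frame_counts), 'videos from train.json are missing under JPEGImages')
print(missing[:10])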

Code for computing FLOPs

Hello authors,
I have also been following work in this area. Because of the attention computation, the FLOPs of works such as STTN and FuseFormer are hard to measure with off-the-shelf model profiling tools. Could you share the code or the calculation method you used when reporting the FLOPs metric, so that we can follow up on your work more easily? Thanks.
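
For reference, a rough back-of-the-envelope estimate for standard multi-head self-attention (a generic formula, not the authors' script): on N tokens of dimension C, the Q/K/V/output projections cost about 4NC^2 multiply-adds and the QK^T and attention-times-V products about 2N^2C, so:

def attention_flops(num_tokens, dim, num_layers=1):
    # Rough FLOPs (2x multiply-adds) for one standard self-attention layer:
    # projections 4*N*C^2 plus attention matrices 2*N^2*C.
    proj = 4 * num_tokens * dim * dim
    attn = 2 * num_tokens * num_tokens * dim
    return 2 * (proj + attn) * num_layers

# Example with hypothetical numbers: 5 frames of 60x108 tokens, channel dim 512.
print(attention_flops(num_tokens=5 * 60 * 108, dim=512) / 1e9, 'GFLOPs per layer')

Window-based attention, as used in this line of work, partitions the tokens first, so the quadratic term should be computed per window rather than over all tokens.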

VRAM limitations, static mask, output?

Hello. Thank you for publishing your new resolution-unlocked model. I have a couple of issues to report, though.

  1. I have 8 GB of VRAM to use, but I get OOM with just 500 frames of 144x120 video. Is your software trying to load all of them into VRAM at once? 200 frames does work, after all. Could I set a cap on the number of frames loaded onto VRAM at once so that it doesn't return OOM? I'm trying to process an input that isn't a GIF.

  2. When I have a non-moving mask, could I just point to one PNG? Duplicating the mask into thousands of files is an extra step of work (see the duplication sketch after this list).

  3. Are you planning on making the output settings more advanced, for example by using FFmpeg output instead of cv2 video writer? Even without the distortion from resizing to 432x240, the outputs aren't the quality I'd like them to be.
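
As a stopgap for point 2, one can simply replicate the single static mask into a folder of per-frame masks (a generic workaround, not a feature of the repo; the file naming below is just an example and should match your frame names):

import os
import shutil

mask_png = 'examples/my_mask.png'     # hypothetical single static mask
out_dir = 'examples/my_video_mask'    # hypothetical mask folder passed via --mask
num_frames = 500

os.makedirs(out_dir, exist_ok=True)
for i in range(num_frames):
    shutil.copy(mask_png, os.path.join(out_dir, f'{i:05d}.png'))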

Training configuration and time

The authors are impressive: not only are the results good, the open-source release is also excellent. May I ask how many GPUs you used for training, and how long it takes to run the full 500k iterations?

How to tune the parameters for high resolution video and low memory GPU

Firstly, this work is awesome. I am trying it on a 640*480 video on a V100 GPU, and I ran into an OOM issue. I successfully got a reasonable result after resizing the video to a very small resolution, but that is not what I want. So I wonder whether there is any way to fit the 640p video onto my V100 GPU by tuning the parameters. Thanks.
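
One generic workaround (not specific to this repo) is to run inference at a reduced resolution and then paste the inpainted pixels back into the full-resolution frames only inside the mask, which keeps the untouched regions sharp. A sketch, assuming frames and masks are uint8 numpy arrays:

import cv2
import numpy as np

def composite_back(full_frame, small_result, mask):
    # Upscale a low-resolution inpainting result and blend it into the original
    # frame only where the mask is non-zero (HxWx3 frames, HxW mask).
    h, w = full_frame.shape[:2]
    up = cv2.resize(small_result, (w, h), interpolation=cv2.INTER_LINEAR)
    m = (mask > 0).astype(np.float32)[..., None]
    return (up * m + full_frame * (1 - m)).astype(np.uint8)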

add web demo/model to Huggingface

Hi, would you be interested in adding E2FGVI to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models/datasets/Spaces (web demos) can be added to a user account or organization, similar to GitHub.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

Here are guides for adding Spaces/models/datasets to your org:

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

Object masks generation for custom recordings

Amazing paper and results, thanks for this work! I can't wait to see the future updates described in the Work in Progress section!
I'm interested in testing your method for the object removal task on my custom videos outside of the popular benchmarks. I was wondering if you could recommend any method for producing these object masks, ideally generating one mask per object in the video?

Question about data normalization

The training data are normalized to [-1, 1], but your code sets the masked (occluded) region to 0. Why is the masked region not set to -1? Is it because this has no effect on the model's results?
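
For context, a tiny sketch of the two options being asked about (not the authors' code): with frames normalized to [-1, 1], filling the hole with 0 corresponds to mid-grey, while filling it with -1 corresponds to black.

import torch

frames = torch.rand(1, 3, 240, 432) * 2 - 1   # frames normalized to [-1, 1]
mask = torch.zeros(1, 1, 240, 432)
mask[..., 60:120, 100:200] = 1                # 1 inside the hole (hypothetical region)

masked_zero = frames * (1 - mask)             # hole filled with 0.0 (mid-grey)
masked_neg1 = frames * (1 - mask) - mask      # hole filled with -1.0 (black), the alternative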

About the

Hello authors, this work is excellent and the results are great!
I have a question about the code. In the uploaded models/modules/feat_prop.py, I compared it with the BasicVSR++ source code, and in backward_propagation the optical flow used to obtain cond_n1 looks questionable: shouldn't frame_idx and flow_idx in your for loop keep the same ordering? That is how BasicVSR++ does it, so I wanted to ask about it.

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED occurred when test.py was run. The environment was installed according to issue #3; the specific environment is as follows. [Screenshot omitted] What could be the reason for this problem, and how can it be fixed?

How to convert the pytorch model to the onnx model?

How can I convert the PyTorch model to an ONNX model? I tried the conversion, but it reported an error and I don't know what the problem is. I'm a beginner; thank you for your advice. My script is as follows:
import torch
import importlib

device = torch.device("cpu")
model = "e2fgvi_hq"

ckpt = 'release_model/E2FGVI-HQ-CVPR22.pth'

net = importlib.import_module('model.' + model)
model = net.InpaintGenerator().to(device)
data = torch.load(ckpt, map_location=device)
model.load_state_dict(data)
print(f'Loading model from: {ckpt}')
model.eval()
x = torch.randn(1, 1, 3, 240, 864, requires_grad=True)
torch.onnx.export(model,                     # model being run
                  (x, 2),                    # model input (or a tuple for multiple inputs)
                  "E2FGVI-HQ-CVPR22.onnx",   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=16,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  dynamic_axes={'input': {1: 'batch_size'}})
The error is as follows:
torch.onnx.symbolic_registry.UnsupportedOperatorError: Exporting the operator ::col2im to ONNX opset version 16 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub.

Request a suggestion for model distillation

This model is great, but the calculation speed is a bit slow. I want to try to distill this model; can you give some advice, for example which layers can be reduced or removed?
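
Not an official recommendation, just the generic response-distillation recipe one could start from: train a smaller student generator to match the released model's inpainted output (optionally together with the usual reconstruction losses). A sketch; the call signature model(masked_frames, l_t) returning (pred_imgs, pred_flows) mirrors how the trainer invokes the generator, everything else is an assumption:

import torch
import torch.nn.functional as F

def distill_step(teacher, student, masked_frames, num_local_frames, optimizer):
    # One knowledge-distillation step: the student mimics the frozen teacher's output.
    teacher.eval()
    with torch.no_grad():
        t_pred, _ = teacher(masked_frames, num_local_frames)
    s_pred, _ = student(masked_frames, num_local_frames)
    loss = F.l1_loss(s_pred, t_pred)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()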

Solving environment: failed

I tried installing on both Windows and Linux, but I get Solving environment: failed.

conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • libffi==3.3=he6710b0_2
  • lcms2==2.12=h3be6417_0
  • matplotlib-base==3.4.2=py37hab158f2_0
  • tornado==6.1=py37h27cfd23_0
  • brotli==1.0.9=he6710b0_2
  • scipy==1.6.2=py37had2a1c9_1
  • bzip2==1.0.8=h7b6447c_0
  • locket==0.2.1=py37h06a4308_1
  • libpng==1.6.37=hbc83047_0
  • ffmpeg==4.2.2=h20bf706_0
  • freetype==2.10.4=h5ab3b9f_0
  • expat==2.4.1=h2531618_2
  • xz==5.2.5=h7b6447c_0
  • ncurses==6.2=he6710b0_1
  • openh264==2.1.0=hd408876_0
  • qt==5.9.7=h5867ecd_1pt150_0
  • pywavelets==1.1.1=py37h7b6447c_2
  • libgfortran-ng==7.5.0=ha8ba4b0_17
  • libwebp-base==1.2.0=h27cfd23_0
  • pcre==8.45=h295c915_0
  • jpeg==9d=h7f8727e_0
  • ca-certificates==2022.2.1=h06a4308_0
  • certifi==2021.10.8=py37h06a4308_2
  • gstreamer==1.14.0=h28cd5cc_2
  • lame==3.100=h7b6447c_0
  • libtiff==4.2.0=h85742a9_0
  • tk==8.6.11=h1ccaba5_0
  • glib==2.69.1=h5202010_0
  • pillow==8.3.1=py37h2c7a002_0
  • libgcc-ng==9.3.0=h5101ec6_17
  • openssl==1.1.1m=h7f8727e_0
  • libstdcxx-ng==9.3.0=hd4cf53a_17
  • fontconfig==2.13.1=h6c09931_0
  • zstd==1.4.9=haebb681_0
  • zlib==1.2.11=h7b6447c_3
  • _openmp_mutex==4.5=1_gnu
  • pyqt==5.9.2=py37h05f1152_2
  • libvpx==1.7.0=h439df22_0
  • libgomp==9.3.0=h5101ec6_17
  • python==3.7.11=h12debd9_0
  • dbus==1.13.18=hb2f20db_0
  • x264==1!157.20191217=h7b6447c_0
  • openjpeg==2.4.0=h3ad879b_0
  • libtasn1==4.16.0=h27cfd23_0
  • lz4-c==1.9.3=h295c915_1
  • cytoolz==0.11.0=py37h7b6447c_0
  • mkl_fft==1.3.0=py37h42c9631_2
  • sqlite==3.36.0=hc218d9a_0
  • gnutls==3.6.15=he1e5248_0
  • icu==58.2=he6710b0_3
  • pytorch==1.5.1=py3.7_cuda9.2.148_cudnn7.6.3_0
  • libgfortran4==7.5.0=ha8ba4b0_17
  • yaml==0.2.5=h7b6447c_0
  • ninja==1.10.2=hff7bd54_1
  • nettle==3.7.3=hbbd107a_1
  • kiwisolver==1.3.1=py37h2531618_0
  • setuptools==58.0.4=py37h06a4308_0
  • libopus==1.3.1=h7b6447c_0
  • libunistring==0.9.10=h27cfd23_0
  • matplotlib==3.4.2=py37h06a4308_0
  • sip==4.19.8=py37hf484d3e_0
  • gmp==6.2.1=h2531618_2
  • pip==21.2.2=py37h06a4308_0
  • numpy-base==1.20.3=py37h74d4b33_0
  • libidn2==2.3.2=h7f8727e_0
  • pyyaml==5.4.1=py37h27cfd23_1
  • libxcb==1.14=h7b6447c_0
  • gst-plugins-base==1.14.0=h8213a91_2
  • ld_impl_linux-64==2.35.1=h7274673_9
  • mkl-service==2.4.0=py37h7f8727e_0
  • libuuid==1.0.3=h7f8727e_2
  • mkl_random==1.2.2=py37h51133e4_0
  • mkl==2021.3.0=h06a4308_520
  • libxml2==2.9.12=h03d6c58_0
  • intel-openmp==2021.3.0=h06a4308_3350
  • numpy==1.20.3=py37hf144106_0
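
A common cause of ResolvePackageNotFound with an exported environment is the platform-specific build string pinned after the last '='. A small sketch (a generic workaround, not part of the repo) that strips those build strings so conda can pick builds available on your platform; some version pins may still need manual edits afterwards:

import re

# Dependency lines look like "  - name=version=build"; drop the trailing build string.
pattern = re.compile(r'^(\s*- \S+?==?[^=\s]+)=[^=\s]+\s*$')

with open('environment.yml') as src, open('environment_nobuild.yml', 'w') as dst:
    for line in src:
        m = pattern.match(line)
        dst.write(m.group(1) + '\n' if m else line)

# Then: conda env create -f environment_nobuild.yml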

Demo videos to contribute

Hi,

Thanks for this great repo and project.

Not really an issue, more of a question:
I see the demo video section is TBD; would you be interested in some in-the-wild inference test videos for the README?
I am planning to run some anyway, hopefully in the next week or so. Let me know and I'll share them.

It would also be great to have a higher-resolution trained model to produce better-quality demo videos, but I see that is already on the work-in-progress list.

RuntimeError: modulated_deformable_im2col_impl: implementation for device cuda:0 not found

I am getting the following error when I try to run the test script with the Docker image below. Am I missing something?

FROM nvidia/cuda:11.6.0-base-ubuntu18.04
LABEL maintainer="Ayush Saraf"
ARG CONDA_PYTHON_VERSION=3
ARG CONDA_DIR=/opt/conda
ARG USERNAME=docker
ARG USERID=1000

# Install basic utilities
RUN apt-get update && \
    apt-get install -y --no-install-recommends git wget unzip bzip2 sudo build-essential ca-certificates ffmpeg libsm6 libxext6 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Install miniconda
ENV PATH $CONDA_DIR/bin:$PATH
RUN wget --quiet \
    https://repo.continuum.io/miniconda/Miniconda$CONDA_PYTHON_VERSION-latest-Linux-x86_64.sh && \
    echo 'export PATH=$CONDA_DIR/bin:$PATH' > /etc/profile.d/conda.sh && \
    /bin/bash Miniconda3-latest-Linux-x86_64.sh -b -p $CONDA_DIR && \
    rm -rf /tmp/*
# Create the user
RUN useradd --create-home -s /bin/bash --no-user-group -u $USERID $USERNAME && \
    chown $USERNAME $CONDA_DIR -R && \
    adduser $USERNAME sudo && \
    echo "$USERNAME ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
USER $USERNAME
WORKDIR /home/$USERNAME

# Conda env
RUN wget https://raw.githubusercontent.com/MCG-NKU/E2FGVI/master/environment.yml
RUN conda env create -f environment.yml
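
For what it is worth, this error usually means the installed mmcv has no compiled CUDA ops (for example a CPU-only mmcv instead of mmcv-full, or a CUDA/PyTorch mismatch). A quick diagnostic to run inside the container (generic checks, not part of the repo):

import torch
import mmcv

print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
print(mmcv.__version__)

# mmcv-full ships compiled ops; this import fails on a build without them.
from mmcv.ops import ModulatedDeformConv2d
print('mmcv deformable-conv op import OK')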

GPU is not working for prediction

Hi, I ran into a problem when predicting with E2FGVI-HQ. My CPU and memory are busy the whole time, but the GPU does not work at all. I have made sure that CUDA is installed successfully, and the device returned by this line is cuda:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

About custom datasets

Hello, I am very lucky to have learned about your model.
I was able to train on the DAVIS dataset successfully, but I have some issues with defining my own dataset.
There are 320 zip files in JPEGImages, and each zip has ten photos.
There are 320 corresponding mask folders in test_masks, each with ten mask photos.
test.json is the same as train.json.

But when we run our own data, the following error occurs: ... is invalid for input of size 11272192


Custom dataset directory:

dataset
└── ballet
    ├── JPEGImages
    │   ├── xxx.zip
    │   └── ...
    ├── test_masks
    │   └── xxx
    ├── train.json
    └── test.json


specific error:
Traceback (most recent call last):
  File "/home/u202080087/data/E2FGVI/train.py", line 84, in
    mp.spawn(main_worker, nprocs=config['world_size'], args=(config, ))
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
    while not context.join():
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/u202080087/data/E2FGVI/train.py", line 64, in main_worker
    trainer.train()
  File "/home/u202080087/data/E2FGVI/core/trainer.py", line 288, in train
    self._train_epoch(pbar)
  File "/home/u202080087/data/E2FGVI/core/trainer.py", line 307, in _train_epoch
    pred_imgs, pred_flows = self.netG(masked_frames, l_t)
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/u202080087/data/E2FGVI/model/e2fgvi_hq.py", line 255, in forward
    trans_feat = self.transformer([trans_feat, fold_output_size])
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/u202080087/data/E2FGVI/model/modules/tfocal_transformer_hq.py", line 551, in forward
    attn_windows = self.attn(x_windows_all, mask_all=x_window_masks_all)
  File "/home/u202080087/.conda/envs/e2f/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/u202080087/data/E2FGVI/model/modules/tfocal_transformer_hq.py", line 252, in forward
    0] * self.window_size[1], C // self.num_heads), (q, k, v))
  File "/home/u202080087/data/E2FGVI/model/modules/tfocal_transformer_hq.py", line 248, in
    lambda t: window_partition(t, self.window_size).view(
  File "/home/u202080087/data/E2FGVI/model/modules/tfocal_transformer_hq.py", line 132, in window_partition
    window_size[1], C)
RuntimeError: shape '[2, 8, 6, 5, 4, 9, 512]' is invalid for input of size 11272192


Looking forward to your reply
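
Hard to say without the data, but the failing reshape in window_partition usually points to frames whose spatial size is not what the model expects. A small check (my own sketch, assuming the zip-of-frames layout shown above) that prints the number of entries and the frame resolutions inside every zip:

import io
import os
import zipfile
from PIL import Image

jpeg_dir = 'dataset/ballet/JPEGImages'   # path taken from the directory tree above
for name in sorted(os.listdir(jpeg_dir)):
    if not name.endswith('.zip'):
        continue
    with zipfile.ZipFile(os.path.join(jpeg_dir, name)) as zf:
        sizes = set()
        for member in zf.namelist():
            if member.lower().endswith(('.jpg', '.jpeg', '.png')):
                with zf.open(member) as fp:
                    sizes.add(Image.open(io.BytesIO(fp.read())).size)
        print(name, len(zf.namelist()), 'entries, frame sizes:', sizes)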

Question about learning rate

Hello, thanks for your work. I have a question about the learning rate. I noticed that your paper says the initial learning rate is 0.0001, reduced by a factor of 10 at 400k iterations,
while in the comparison work FuseFormer the initial learning rate is 0.01, reduced by a factor of 10 at 200k, 400k, and 450k.
Have you tested the difference between these two schedules? What made you choose not to follow FuseFormer's configuration?
Looking forward to your answer!

Can ModulatedDeformConv2dFunction be replaced with another function? I'm having trouble converting the model to another format to save with torch.jit.save.

It looks like this is caused by not being able to export ModulatedDeformConv2dFunction:

Traceback (most recent call last):
  File "test6.py", line 370, in <module>
    main_worker()
  File "test6.py", line 294, in main_worker
    traced_model.save("traced_model3.pt")
  File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/torch/jit/_script.py", line 487, in save
    return self._c.save(*args, **kwargs)
RuntimeError: 
Could not export Python function call 'ModulatedDeformConv2dFunction'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to __constants__:

I added the saving code at this point:

...
with torch.no_grad():
    masked_imgs = selected_imgs * (1 - selected_masks)
    mod_size_h = 60
    mod_size_w = 108
    h_pad = (mod_size_h - h % mod_size_h) % mod_size_h
    w_pad = (mod_size_w - w % mod_size_w) % mod_size_w
    masked_imgs = torch.cat(
        [masked_imgs, torch.flip(masked_imgs, [3])],
        3)[:, :, :, :h + h_pad, :]
    masked_imgs = torch.cat(
        [masked_imgs, torch.flip(masked_imgs, [4])],
        4)[:, :, :, :, :w + w_pad]

    ids = torch.randint(10, (1,))
    print(ids.shape)
    ids[0] = len(neighbor_ids)
    print(ids.item())
    # jit.trace seems to want all inputs as tensors, so the second (integer) argument
    # is passed as a tensor here; inside the inpaint forward, use
    # l_t = num_local_frames.item()
    traced_model = torch.jit.trace(model, (masked_imgs, ids))

    # torch.save(traced_model, "traced_model2.pt")
    traced_model.save("traced_model3.pt")
    exit()

Request for the visualization codes

Hi, thanks for your wonderful work. I noticed that you visualize the feature maps of the local frames' features; could you please provide the code for that?

In supp material: To further investigate the effectiveness of the feature propagation module, we visualize averaged local neighboring features with the temporal size of 5 before conducting content hallucination in Fig. 10.
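
In case a rough substitute helps while waiting: averaging the local frames' feature maps over the temporal and channel dimensions and saving a heatmap can be done with a few lines (a generic sketch, not the authors' visualization code; the (T, C, H, W) layout of local_feats is an assumption):

import matplotlib.pyplot as plt
import torch

def save_feature_heatmap(local_feats, path='local_feat.png'):
    # local_feats: (T, C, H, W) features of the local frames; average over T and C.
    with torch.no_grad():
        fmap = local_feats.float().mean(dim=(0, 1)).cpu().numpy()
    plt.imshow(fmap, cmap='viridis')
    plt.axis('off')
    plt.savefig(path, bbox_inches='tight', pad_inches=0)
    plt.close()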

How can GPU memory usage be optimized?

Hello authors,
This is an excellent algorithm!
When trying to train I ran into:
RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 2; 11.93 GiB total capacity; 11.07 GiB already allocated; 56.12 MiB free; 11.43 GiB reserved in total by PyTorch)
My machine has four GPUs; training only works after changing "batch_size" from 8 to 4, which makes the batch size 2 for each GPU.
If I want to increase the batch size, is it possible to optimize the memory usage? Do you have any suggestions?
Thanks!
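
One generic way to emulate a larger batch without extra memory is gradient accumulation (a standard trick, not something the repo implements): step the optimizer only every accum_steps mini-batches. A sketch with hypothetical names:

def train_with_accumulation(dataloader, compute_loss, optimizer, accum_steps=2):
    # Effective batch size = per-step batch size * accum_steps.
    optimizer.zero_grad()
    for step, batch in enumerate(dataloader):
        loss = compute_loss(batch)        # hypothetical callable returning a scalar loss
        (loss / accum_steps).backward()
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

Note that with GAN-style training the generator and discriminator updates would each need the same treatment.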

Output encoding settings

Hello. After a long while of trial and error, I managed to get this software running. It still doesn't run well, giving me OOM with more than 250 frames of 120x144 video. I have an 8 GB 3060 Ti, which should be enough for this, in my opinion. Needing to split tasks many times is a pain, but it might be manageable.

What isn't manageable are the output settings. H.263 is outdated, and with tiny input sizes and lengths, lossy output is a baffling pick. Maybe I missed a customization option somewhere? I would like to have lossless H.264 or FFV1. In addition, I would like to decide the video's framerate (very important for syncing) and not have the video resized, which causes distortions that look bad.

Thank you. Looking forward to the high-resolution model.

About the feat propagation function

Hello, I found that the return value of the forward bidirect flow function is (forward, backward), but the input of the feat propagation function is (backward, forward). It seems like you pass the forward flow as the backward flow into the propagation function. Is that by design or a mistake? Looking forward to your reply. :)

E2FGVI_HQ in Google Colab

I attempted to use E2FGVI_HQ in Google Colab, replacing the model/release model and any reference to e2fgvi with e2fgvi_hq. I made sure I was within the CUDA limits and set the width and height to the source values.

When inpainting it returns the error: "IndexError: list index out of range"

Is there a way to get this to work properly?

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

Repo                OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine            -                      0.x
MMCV                1.x                    2.x
MMDetection         0.x, 1.x, 2.x          3.x
MMAction2           0.x                    1.x
MMClassification    0.x                    1.x
MMSegmentation      0.x                    1.x
MMDetection3D       0.x                    1.x
MMEditing           0.x                    1.x
MMPose              0.x                    1.x
MMDeploy            0.x                    1.x
MMTracking          0.x                    1.x
MMOCR               0.x                    1.x
MMRazor             0.x                    1.x
MMSelfSup           0.x                    1.x
MMRotate            1.x                    1.x
MMYOLO              -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

frame_idx and flow_idx

[syujung] asked the following question on July 20 (Issue #25):
"Hello authors, this work is excellent and the results are great!
I have a question about the code. In the uploaded models/modules/feat_prop.py, I compared it with the BasicVSR++ source code, and in backward_propagation the optical flow used to obtain cond_n1 looks questionable: shouldn't frame_idx and flow_idx in your for loop keep the same ordering? That is how BasicVSR++ does it, so I wanted to ask about it."

Could you explain concretely how the current code should be modified? Many thanks!

Problem with computing the optical flow warping error

Dear authors,

Thank you for your outstanding work. I ran into a problem when computing E_warp: in the repository recommended in the README, the web page hosting the pretrained model can no longer be accessed (a problem with their lab's website, not a network restriction on my side), so I cannot obtain the pretrained model.

Could you upload the pretrained model used for the flow warping error computation? Thanks!

: )

Question about the SPyNet model

Hello, I would like to ask why the SPyNet that is fine-tuned during training does not seem to be saved, and why the pretrained SPyNet is still used during testing.

Model Performance on Medical Video Datasets

First off, your paper on an end-to-end framework for guided video completion is amazing! I used it on a medical task and it provided satisfactory results. However, it seems that the original iterative FGVC method performed better for my medical image-related task. Is there any way that I can use the pre-trained weights to fine-tune the model? If so, can you advise me as to which parts of the network I should freeze to preserve inpainting quality while allowing the network to familiarize itself with my domain-specific data?

I also only have access to one V100 GPU through Google Colab, so I'm unable to use distributed training. Any help would be greatly appreciated!
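
Not speaking for the authors, but a generic fine-tuning starting point is to load the released generator weights, freeze the early feature-extraction/propagation parameters by name, and train the rest at a small learning rate. The parameter-name prefixes below are assumptions and should be checked against model.named_parameters():

import importlib
import torch

net = importlib.import_module('model.e2fgvi_hq')
model = net.InpaintGenerator()
model.load_state_dict(torch.load('release_model/E2FGVI-HQ-CVPR22.pth', map_location='cpu'))

freeze_prefixes = ('encoder', 'feat_prop_module')   # hypothetical prefixes; verify them first
for name, param in model.named_parameters():
    param.requires_grad = not name.startswith(freeze_prefixes)

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-5)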

Question about the method of saving inference result

Hi, thanks for your contribution. I am very inspired by your work.
I have one question. Is there any specific reason for producing the final result with the code comp_frames[idx] = comp_frames[idx].astype(np.float32) * 0.5 + img.astype(np.float32) * 0.5 ? Since this blends each newly generated prediction in with a weight of 0.5, the influence of earlier predictions is halved every time, so the result is biased toward the prediction made from the last neighboring frames. Is this a common way to produce video inpainting results?
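
For comparison, the alternative implied by the question is an unweighted running mean, where every prediction of a frame contributes equally instead of the most recent one dominating. A sketch (a variant for discussion, not the repo's code):

import numpy as np

comp_sums, comp_counts = {}, {}   # per-frame accumulators

def accumulate(idx, img):
    # Add one prediction of frame idx to its running sum.
    comp_sums[idx] = comp_sums.get(idx, 0) + img.astype(np.float32)
    comp_counts[idx] = comp_counts.get(idx, 0) + 1

def finalize(idx):
    # Equal-weight average of all predictions of frame idx.
    return (comp_sums[idx] / comp_counts[idx]).astype(np.uint8)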

Windows environment

Hey there, just an FYI: this works for me on Windows with CUDA 11.1 and a 3090.

conda create -n e2fgvi python=3.7
conda activate e2fgvi
python -m pip install --upgrade pip
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts matplotlib==3.4.1
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts opencv-python==4.5.5.62
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts vispy==0.9.3
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts transforms3d==0.3.1
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts networkx==2.3
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts scikit-image==0.19.2
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts pyaml==21.10.1
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts moviepy==1.0.3
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts pyqt6==6.3.0
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9/index.html
pip install tensorboard matplotlib

The original environment-solving bug thread resulted in reaching the inference portion only after an abnormally long time with no feedback, and then it crashed saying that there was no kernel on the device.

After setting up the environment this way,
python test.py --model e2fgvi_hq --video examples/tennis --mask examples/tennis_mask --ckpt release_model/E2FGVI-HQ-CVPR22.pth
processed the test video in about 10 seconds.

About the speed test

Hello authors, thanks for your excellent work! I learned a lot from it =)
Was the speed test done on the DAVIS dataset at 480x854 resolution, by measuring the time of one forward pass and dividing it by len(neighbor_ids)?
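
For reference, a rough timing sketch along those lines (tensor shapes and the exact protocol are assumptions; the authors' measurement may differ), remembering to synchronize CUDA before reading the clock:

import time
import torch

def seconds_per_frame(model, masked_frames, num_local_frames, neighbor_ids, warmup=5, iters=20):
    # Average one forward pass over several runs, then divide by len(neighbor_ids).
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):
            model(masked_frames, num_local_frames)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(masked_frames, num_local_frames)
        torch.cuda.synchronize()
    return (time.time() - start) / iters / len(neighbor_ids)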

Simplify google colab notebook

Hi.
Thanks for your colab notebook.
The cell named "Download model with PyDrive" can be simplified by using the built-in gdown tool instead. [The before/after screenshots are omitted.]
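
For example, the download cell could be reduced to something like the following (the Google Drive file ID is a placeholder, not the real one):

import gdown

file_id = 'YOUR_DRIVE_FILE_ID'   # placeholder for the checkpoint's Drive ID
gdown.download(f'https://drive.google.com/uc?id={file_id}',
               'release_model/E2FGVI-HQ-CVPR22.pth', quiet=False)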

Hi about the memory error

When I try to run my own video, I run into a memory problem:

RuntimeError: CUDA out of memory. Tried to allocate 1.62 GiB (GPU 0; 8.00 GiB total capacity; 5.05 GiB already allocated; 0 bytes free; 7.01 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The frame count and video size are smaller than the tennis or schoolgirl demos. I am able to run those two demos, but not my own.
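
Since the error message itself suggests it, one thing worth trying (a generic PyTorch allocator tweak, no guarantee it is enough here) is setting PYTORCH_CUDA_ALLOC_CONF before PyTorch initializes CUDA, and making sure inference runs under torch.no_grad():

import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'  # set before any CUDA work

import torch
print(torch.cuda.is_available())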

about the loss

Hello, when I try to train your e2fgvi_hq model on the YouTube-VOS dataset, the model always collapses after 400k iterations and the flow loss becomes NaN. Have you ever faced this problem? Looking forward to your answer!

Training datasets?

May I ask whether you used only YouTube-VOS for training, or both YouTube-VOS and DAVIS? Your source code seems to use only YouTube-VOS.
