
flowformer-official's Introduction

FlowFormer: A Transformer Architecture for Optical Flow
Zhaoyang Huang*, Xiaoyu Shi*, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li
ECCV 2022

News

Our FlowFormer++ and VideoFlow have been accepted by CVPR and ICCV, ranking 2nd and 1st on the Sintel benchmark, respectively. Please also refer to our FlowFormer++ and VideoFlow.

TODO List

  • Code release (2022-8-1)
  • Models release (2022-8-1)

Data Preparation

Similar to RAFT, you will need to download the required datasets to evaluate/train FlowFormer.

By default, datasets.py will search for the datasets in the locations below. You can create symbolic links in the datasets folder pointing to wherever the datasets were downloaded; example link commands are shown after the layout.

├── datasets
    ├── Sintel
        ├── test
        ├── training
    ├── KITTI
        ├── testing
        ├── training
        ├── devkit
    ├── FlyingChairs_release
        ├── data
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── optical_flow
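
If the datasets already live somewhere else, symbolic links in this layout are enough. For example (the source paths below are placeholders for wherever you downloaded the data):

mkdir -p datasets
ln -s /path/to/Sintel datasets/Sintel
ln -s /path/to/KITTI datasets/KITTI
ln -s /path/to/FlyingChairs_release datasets/FlyingChairs_release
ln -s /path/to/FlyingThings3D datasets/FlyingThings3D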

Requirements

conda create --name flowformer
conda activate flowformer
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy opencv -c pytorch
pip install yacs loguru einops timm==0.4.12 imageio

Training

The script loads the config according to the training stage. The trained model is saved under logs and checkpoints. For example, the following command loads the config configs/default.py and saves the trained model as logs/xxxx/final and checkpoints/chairs.pth.

python -u train_FlowFormer.py --name chairs --stage chairs --validation chairs

To finish the entire training schedule, you can run:

./run_train.sh
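
For reference, the full schedule runs the four stages (chairs, things, sintel, kitti) in order. A hedged sketch of what that amounts to, with illustrative flags that may differ from the shipped script:

python -u train_FlowFormer.py --name chairs --stage chairs --validation chairs
python -u train_FlowFormer.py --name things --stage things --validation sintel
python -u train_FlowFormer.py --name sintel --stage sintel --validation sintel
python -u train_FlowFormer.py --name kitti --stage kitti --validation kitti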

Models

We provide models trained in the four stages. The default path of the models for evaluation is:

├── checkpoints
    ├── chairs.pth
    ├── things.pth
    ├── sintel.pth
    ├── kitti.pth
    ├── flowformer-small.pth 
    ├── things_kitti.pth

flowformer-small.pth is a small version of our FlowFormer. things_kitti.pth is the FlowFormer# introduced in our supplementary material, used for evaluation on the KITTI training set.

Evaluation

The model to be evaluated is specified by _CN.model in the config file.
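
For example, to evaluate a different checkpoint you can edit that field in the corresponding config. A minimal illustrative excerpt, assuming the yacs-style configs listed in the requirements (the surrounding fields are omitted):

from yacs.config import CfgNode as CN

_CN = CN()
_CN.model = 'checkpoints/things.pth'  # checkpoint loaded by the evaluation scripts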

Evaluating the model on the Sintel training set and the KITTI training set. The corresponding config file is configs/things_eval.py.

# with tiling technique
python evaluate_FlowFormer_tile.py --eval sintel_validation
python evaluate_FlowFormer_tile.py --eval kitti_validation --model checkpoints/things_kitti.pth
# without tiling technique
python evaluate_FlowFormer.py --dataset sintel
          with tile   w/o tile
clean     0.94        1.01
final     2.33        2.40

Evaluating the small version model. The corresponding config file is configs/small_things_eval.py.

# with tiling technique
python evaluate_FlowFormer_tile.py --eval sintel_validation --small
# without tiling technique
python evaluate_FlowFormer.py --dataset sintel --small
          with tile   w/o tile
clean     1.21        1.32
final     2.61        2.68

Generating the submission for the Sintel and KITTI benchmarks. The corresponding config file is configs/submission.py.

python evaluate_FlowFormer_tile.py --eval sintel_submission
python evaluate_FlowFormer_tile.py --eval kitti_submission

Visualizing the Sintel dataset:

python visualize_flow.py --eval_type sintel --keep_size

Visualizing an image sequence extracted from a video:

python visualize_flow.py --eval_type seq

The default image sequence format is:

├── demo_data
    ├── mihoyo
        ├── 000001.png
        ├── 000002.png
        ├── 000003.png
            .
            .
            .
        ├── 001000.png
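
To produce such a sequence from a video, a plain ffmpeg call works; for example (file and folder names are placeholders):

mkdir -p demo_data/mihoyo
ffmpeg -i your_video.mp4 demo_data/mihoyo/%06d.png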

License

FlowFormer is released under the Apache License.

Citation

@article{huang2022flowformer,
  title={{FlowFormer}: A Transformer Architecture for Optical Flow},
  author={Huang, Zhaoyang and Shi, Xiaoyu and Zhang, Chao and Wang, Qiang and Cheung, Ka Chun and Qin, Hongwei and Dai, Jifeng and Li, Hongsheng},
  journal={{ECCV}},
  year={2022}
}
@inproceedings{shi2023flowformer++,
  title={Flowformer++: Masked cost volume autoencoding for pretraining optical flow estimation},
  author={Shi, Xiaoyu and Huang, Zhaoyang and Li, Dasong and Zhang, Manyuan and Cheung, Ka Chun and See, Simon and Qin, Hongwei and Dai, Jifeng and Li, Hongsheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1599--1610},
  year={2023}
}
@article{huang2023flowformer,
  title={FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow},
  author={Huang, Zhaoyang and Shi, Xiaoyu and Zhang, Chao and Wang, Qiang and Li, Yijin and Qin, Hongwei and Dai, Jifeng and Wang, Xiaogang and Li, Hongsheng},
  journal={arXiv preprint arXiv:2306.05442},
  year={2023}
}

Acknowledgement

In this project, we use parts of code from:

flowformer-official's People

Contributors

drinkingcoder


flowformer-official's Issues

evaluate_FlowFormer_tile.py: module 'datasets' has no attribute 'MpiSintel_submission'

Hi, thanks for this remarkable work. When I try to generate the Sintel submission results with evaluate_FlowFormer_tile.py, an error occurs: module 'datasets' has no attribute 'MpiSintel_submission'. I checked datasets.py and there is indeed no mention of this keyword or an implementation of this attribute. Looking forward to your reply.

The output of the pretrained model is a list of 32 tensors

When I use the pretrained model in visualize_flow.py, everything works: the compute_flow() function outputs a [1, 2, H, W] shaped tensor.
But when I use the model in my own project, something strange happens: compute_flow() outputs a list containing 32 tensors, which I guess are flows, even though I only input two correctly shaped frames. Have you ever encountered this situation?

CUDA out of memory when running "python visualize_flow.py --eval_type seq"

Thanks for your excellent work! I used the command "python visualize_flow.py --eval_type seq" to test the code and got the error "RuntimeError: CUDA out of memory. Tried to allocate 1.48 GiB (GPU 0; 5.78 GiB total capacity; 2.33 GiB already allocated; 1008.06 MiB free; 2.99 GiB reserved in total by PyTorch)".
My GPU is a 3060 with 6 GB of memory; I don't understand why it runs out of memory during inference.

About C+T+S+K+H checkpoints

Thank you for sharing the code.

I wonder where I can find the checkpoint for the model stated as C+T+S+K+H in the paper.

Release Code

Wonderful work!
When will the code be released?

Need more information about checkpoints

Does chairs.pth mean trained on FlyingChairs with ImageNet-pretrained weights, or without?

things.pth == chairs + things
sintel.pth == chairs + things + sintel
kitti.pth == chairs + things + sintel + kitti + hd1k
things_kitti.pth == chairs + things + kitti
flowformer-small.pth == small version, mixed training on C+T+S+K+HD1K

Please provide clear details.

Thank you.

The training stage is so slow.

I am using 4 V100 (32 GB) GPUs to train with the "small_things_eval.py" config file and the "C+T+K+S" dataset strategy. The batch size is set to at most 4, and I have also set the DataLoader with pin_memory=True and num_workers=16. However, the training process is extremely slow. Is there any way to speed up the training stage?

So you also play Genshin Impact.jpg

So you also play Genshin Impact.jpg

Thumbs up for the Raiden Shogun. Well done! Using it for action recognition improved the results. Thanks!

How to find the model information

Thanks for your algorithm.

Is there a way to look into the model? I'm quite confused about how to find the convolution functions in the code.

Parts are missing in the encoder part

Hello, in the MemoryEncoder class it seems that the definition of self.layers is missing: in the if self.cfg.feat_cross_attn branch of the forward() function, self.layers is used without being defined.

CUDA out of memory while training on FlyingThings3D

Thank you for releasing your model code! I really appreciate it.

I am trying to train FlowFormer on the FlyingThings3D dataset, but even with the batch size reduced to 1 and --mixed_precision enabled, I keep getting CUDA out of memory errors.

I am using NVIDIA Titan GPUs, which have 12 GiB of memory. Is there a way to further reduce memory usage so that I can train on these GPUs without using the flowformer-small model?

question about feat_encoder

The code extracts the features of img1 and img2 separately, while in RAFT the features of the two images are extracted in a single pass by concatenating them. Is there a difference in effect?

Code snippets (RAFT) :
fmap1, fmap2 = self.fnet([image1, image2])

Code snippets (FlowFormer) :
feat_s = self.feat_encoder(img1)
feat_t = self.feat_encoder(img2)

It seems that both are weight-sharing, but will running the extractor twice be slower?
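
For a weight-sharing extractor the two variants produce the same features (batch-norm statistics aside when training); batching mainly amortizes kernel-launch overhead rather than changing the result. A minimal self-contained sketch with a stand-in encoder (the Conv2d below is a placeholder, not the repo's encoder):

import torch

feat_encoder = torch.nn.Conv2d(3, 64, 3, padding=1).eval()  # stand-in shared-weight encoder
img1 = torch.randn(1, 3, 64, 64)
img2 = torch.randn(1, 3, 64, 64)

with torch.no_grad():
    # FlowFormer style: two separate passes through the same weights
    feat_s = feat_encoder(img1)
    feat_t = feat_encoder(img2)
    # RAFT style: one pass over the two frames stacked along the batch dimension
    feat_st = feat_encoder(torch.cat([img1, img2], dim=0))

print(torch.allclose(feat_s, feat_st[0:1]), torch.allclose(feat_t, feat_st[1:2]))  # True True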

Question about training on FlyingThings3D

Your work is excellent!
I have been troubled by a problem for a long time and hope to get your help. When training on FlyingThings3D, the test EPE on Sintel and KITTI fluctuates greatly. Is this normal?

parallel error

Traceback (most recent call last):
File "/root/FlowFormer-Official/train_FlowFormer.py", line 169, in
train(cfg)
File "/root/FlowFormer-Official/train_FlowFormer.py", line 89, in train
flow_predictions = model(image1, image2, output)
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 186, in forward
outputs = self.parallel_apply(replicas, inputs, module_kwargs)
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py", line 201, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/parallel/parallel_apply.py", line 108, in parallel_apply
output.reraise()
File "/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/_utils.py", line 705, in reraise
raise exception
TypeError: Caught TypeError in replica 1 on device 1.
I am not sure whether this is a cuDNN error, because cuDNN reports an error too.
/root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a CuDNNError: cuDNN error: CUDNN_STATUS_BAD_PARAM
Exception raised from run_conv_plan at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:374 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f626257a897 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: + 0xe1c861 (0x7f62160ed861 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #2: + 0x1095d83 (0x7f6216366d83 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #3: + 0x1097c2c (0x7f6216368c2c in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #4: + 0x109817b (0x7f621636917b in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #5: + 0x107aca2 (0x7f621634bca2 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #6: at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, bool) + 0x53f (0x7f621634c66f in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #7: + 0x32d0a9e (0x7f62185a1a9e in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #8: + 0x32e8251 (0x7f62185b9251 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
frame #9: at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::SymInt, bool, bool, bool) + 0x2bb (0x7f624bbb8c2b in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #10: at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long, bool, bool, bool, bool) + 0x13cb (0x7f624adf380b in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #11: + 0x2e0089f (0x7f624bf8189f in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #12: + 0x2e071fc (0x7f624bf881fc in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #13: at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, bool, c10::ArrayRefc10::SymInt, c10::SymInt, bool, bool, bool, bool) + 0x344 (0x7f624b6ca6f4 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #14: at::native::convolution(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long) + 0x3b8 (0x7f624ade6e88 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #15: + 0x2e0013c (0x7f624bf8113c in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #16: + 0x2e07068 (0x7f624bf88068 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #17: at::_ops::convolution::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, bool, c10::ArrayRefc10::SymInt, c10::SymInt) + 0x17b (0x7f624b68838b in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #18: + 0x4503901 (0x7f624d684901 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #19: + 0x4504879 (0x7f624d685879 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #20: at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, bool, c10::ArrayRefc10::SymInt, c10::SymInt) + 0x2d4 (0x7f624b6c94f4 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #21: + 0x19bd900 (0x7f624ab3e900 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #22: at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::SymInt) + 0x16b (0x7f624adea76b in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #23: + 0x2ff96c3 (0x7f624c17a6c3 in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #24: + 0x2ff995d (0x7f624c17a95d in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #25: at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optionalat::Tensor const&, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::ArrayRefc10::SymInt, c10::SymInt) + 0x26e (0x7f624bced95e in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
frame #26: + 0x6853ad (0x7f62611803ad in /root/miniconda3/envs/flownet2/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #27: python() [0x4fdc87]

frame #30: python() [0x5099ce]
frame #32: python() [0x509b26]
frame #34: python() [0x509b26]
frame #38: python() [0x5cf883]
frame #41: python() [0x509b26]
frame #43: python() [0x509b26]
frame #47: python() [0x5cf883]
frame #50: python() [0x509b26]
frame #52: python() [0x509b26]
frame #56: python() [0x5cf883]
frame #59: python() [0x509b26]
frame #61: python() [0x509b26]
(Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:921.)
return F.conv2d(input, weight, bias, self.stride,

How to register a Sintel account?

I know this question is ridiculous. I have tried all kinds of methods, but still cannot register successfully. I tried to sign up with my school email and a Google email, but it didn't work; the following message pops up. I have also sent an email to [email protected], but have never received a reply. How did you sign up for a Sintel account?

How to fine-tune a model on a custom dataset

Hi @drinkingcoder,

First of all, thank you for your wonderful research. I am interested in training the FlowFormer model on custom data. Currently, it seems that training code is only provided for benchmark datasets such as Chairs, Things, Sintel, KITTI, etc. I have some video data, and I'm going to cut the video into frames and use it for training.

I'm just starting research in the field of optical flow, so I don't know much about it.
Is there code that processes a custom dataset for training? Or do I have to build the dataset myself?
Additionally, how do I create a .flo file?

Thank you in advance for your response!

P.S.) What does autoflow mean in the code that handles args.stage in the training script?
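
On the .flo question: the Middlebury .flo format is just a small binary header followed by interleaved (u, v) float32 values. A minimal writer sketch (this helper is not part of the repo):

import numpy as np

def write_flo(path, flow):
    """Write an (H, W, 2) flow field to a Middlebury .flo file."""
    flow = np.asarray(flow, dtype=np.float32)
    h, w = flow.shape[:2]
    with open(path, 'wb') as f:
        np.array([202021.25], dtype=np.float32).tofile(f)  # magic number ('PIEH')
        np.array([w, h], dtype=np.int32).tofile(f)          # width, height
        flow.tofile(f)                                       # row-major, u and v interleaved per pixel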

Question about keeping my own image size

I'd like to visualize an image sequence extracted from a video, and I need the output .flo file size to be consistent with my input images. How can I do that?
My images are (120, 180, 3).
Thank you!
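
In general, models in this family expect spatial dimensions divisible by 8, so one recipe is to pad the frames, run the model, and crop the flow back to the original size (the --keep_size flag of visualize_flow.py may already handle this). A minimal sketch with a hypothetical model call, not the repo's own padder:

import torch
import torch.nn.functional as F

def flow_at_original_size(model, img1, img2):
    # img1/img2: (1, 3, H, W) tensors; returns flow of shape (1, 2, H, W)
    h, w = img1.shape[-2:]
    pad_h = (8 - h % 8) % 8
    pad_w = (8 - w % 8) % 8
    img1_p = F.pad(img1, (0, pad_w, 0, pad_h), mode='replicate')  # pad right/bottom
    img2_p = F.pad(img2, (0, pad_w, 0, pad_h), mode='replicate')
    flow = model(img1_p, img2_p)[0]   # assume the first output is the final flow prediction
    return flow[..., :h, :w]          # crop back to the input resolution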

Question about the pretrained model

I tested the model you provided for KITTI on the KITTI training set and got an EPE of 0.48 and an F1 of 0.87, which does not match the EPE of 0.53 and F1 of 1.11 in your paper.

About generalization results in Table 1 and Table 3

Hi, thank you for the inspiring work!
I have a question about the results on generalization performance in Table 1 and Table 3.


What is the difference between those two results on Sintel even though the results on KITTI are the same?

Flow output is incorrect

I downloaded the MPI Sintel (European Mirror) data from http://sintel.is.tue.mpg.de/, and when I visualize the ground-truth flow using the flow_viz method from frame_utils I get outputs like this:

The first half of the frame appears twice in the flow. Is it supposed to be like that, or am I missing something?

Training on C+T

Thanks for your impressive work.
I used your code and ran the command

./run_train.sh

When training the model on FlyingThings, I find that the performance on Sintel (train) is much better than the results provided in the paper's Table 1. The only difference is that I changed the batch size to 4 due to the memory limit. The logs are shown below:

[training curves omitted]
The experiment name of the blue curve is things/latentcostformer/cost_heads_num[1]vert_c_dim[64]cnet[twins]pretrain[True]add_flow_token[True]encoder_depth[3]gma[GMA]cost_encoder_res[True] (08_19_21_07), and the orange curve is the log of training on FlyingChairs: chairs/latentcostformer/cost_heads_num[1]vert_c_dim[64]cnet[twins]pretrain[True]add_flow_token[True]encoder_depth[3]gma[True]cost_encoder_res[True]arxiv2(08_17_16_47).
So, I would like to know why the performance differs from the one reported in the paper; is there any difference in the code?
Looking forward to your kind reply!

Training on a custom dataset

Hi @drinkingcoder,

Really nice work! It's really cool to see transformers used for optical flow. I have more of a question than an issue: I am interested in training FlowFormer on my own dataset. Is it possible to create your own dataset? If so, can you give any recommendations / tools to help curate one? For context, I know very little about optical flow; I just learned about the field last month.

Thanks,
Owen

Question about parameters printed during training

Hello! I really like this work, so I'd like to figure out some of the training parameters.
For example, the first printed round is: "| INFO | core.utils.logger:_print_training_status:19 - [ 100, [2.3958680439853386e-07]] 0.5127, 0.0426, 0.5117, 0.2746, 0.5180, 0.5117, 5.2532".
What does the 2.39...e-07 in brackets mean? I also saw that there are four metrics (epe, 1px, 3px, 5px) in the code; which values in the example correspond to them? And there are three more values whose meaning is unclear.
Sorry for my lack of background knowledge. I hope the author has time to answer.

Query About Training Resources

Hello,

Could you provide details about the training environment and the duration it took to train the published model?

Thank you.

Using the code without GPU

I want to use FlowFormer on a system without a GPU, but the code uses CUDA. Can someone explain what I have to change to run train.sh?
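
As a general pointer (not specific to this repo's scripts): the CUDA-only parts are usually the .cuda() / DataParallel calls and the device used when loading the checkpoint. A hedged sketch of loading a checkpoint on CPU, where build_model and cfg are placeholders for however the repo constructs the network:

import torch

device = torch.device('cpu')
model = build_model(cfg)  # placeholder for the repo's model constructor
state = torch.load('checkpoints/sintel.pth', map_location=device)
# checkpoints saved from nn.DataParallel carry a 'module.' prefix; strip it if present
state = {k.replace('module.', '', 1): v for k, v in state.items()}
model.load_state_dict(state)
model.to(device).eval()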

a question about evaluate_FlowFormer.py

Very interesting work!

I have a question about evaluate_FlowFormer.py, which reports AEPE 0.63/1.51 on Sintel clean/final for the provided things.pth checkpoint.
flow_pre = model(image1, image2)
flow_pre = padder.unpad(flow_pre[0]).cpu()
epe = torch.sum((flow_pre - flow_gt)**2, dim=0).sqrt()
For my run on Sintel images, flow_pre has size [1, 2, 436, 1024] and flow_gt [2, 436, 1024]. (flow_pre - flow_gt)**2 has size [1, 2, 436, 1024], and summing over dim 0 gives epe of size [2, 436, 1024]. Thus, epe doesn't sum over the horizontal and vertical flow errors.
flow_pre = torch.squeeze(padder.unpad(flow_pre[0]).cpu())
would result in epe of size [436, 1024] and reports AEPE 1.01/2.40.

Could you please help check it?

Thanks.
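
For reference, the per-pixel end-point error should reduce over the (u, v) channel dimension; with the batch dimension still present, dim=0 reduces the wrong axis. A minimal sketch of the difference:

import torch

flow_pre = torch.randn(1, 2, 436, 1024)  # prediction with batch dimension
flow_gt = torch.randn(2, 436, 1024)      # ground truth without batch dimension

# dim=0 of the broadcast 4-D tensor is the batch axis, so u and v are never combined
epe_wrong = torch.sum((flow_pre - flow_gt) ** 2, dim=0).sqrt()           # shape (2, 436, 1024)

# squeeze the batch axis first so dim=0 is the channel axis
epe = torch.sum((flow_pre.squeeze(0) - flow_gt) ** 2, dim=0).sqrt()      # shape (436, 1024)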

Question about the result on KITTI 2015 and Sintel benchmark

From your reported results, I found that the tile technique can effectively reduce EPE. Do you have any results on the Sintel and KITTI 2015 test sets without the tile technique? Since none of the other methods use tiling, I was wondering how FlowFormer would fare under the same conditions.

Code release of VideoFlow

Will the source code of VideoFlow [1], which is a wonderful work, be released?

[1] VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation

CUDA out of memory error

I get the error "CUDA out of memory", no matter the dataset when using my 4GB GPU. Can you reduce the Batch size or is there another solution you know?

How to deal with the NVIDIA L20 being incompatible with your PyTorch version?

I tried to set up the environment as mentioned in the README, but I ran into a problem:
/root/miniconda3/envs/flowformer/lib/python3.7/site-packages/torch/cuda/init.py:125: UserWarning:
NVIDIA L20 with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the NVIDIA L20 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

Question about the evaluation results with the pretrained weights

Thanks for your outstanding work.
I have a question about the evaluation results with the C+T-trained weights (i.e., things.pth). With the default commands,
python evaluate_FlowFormer_tile.py --eval sintel_validation
python evaluate_FlowFormer_tile.py --eval kitti_validation
I obtain an EPE of [0.939294, 2.329442] on Sintel, which matches the reported results (clean 0.94, final 2.33). However, the results on KITTI are [4.531271, 15.322450], which seem to drop a lot compared with the scores of [4.09†, 14.72†] in the paper. Am I missing something in the KITTI evaluation?


About evaluate_FlowFormer_tile

When I used validate_kitti to evaluate on the KITTI training set with a model trained on C+T, what is the meaning of TRAIN_SIZE = [376, 720]? The training size is [432, 960] in configs/things.py.
In addition, I directly used validate_kitti in evaluate_FlowFormer_tile to validate my own model and got very poor results.
Did I miss some details?
