
Comments (10)

64327069 commented on August 16, 2024

I would like to ask how to evaluate the results: there is no evaluation code in the repo, nor any guidance on evaluation. Could you provide the eval code? I am reproducing the results now.


AssafSinger94 commented on August 16, 2024

Given this situation, I wanted to ask: would it be possible for you to provide the pre-trained weights for TAP-Vid DAVIS? I think that would be the optimal solution, and much simpler than retraining the model and making sure everything works perfectly. It would be deeply appreciated.


Guo-1212 commented on August 16, 2024

The paper mentions these two methods:
(Screenshot 2024-02-22 181011: the two methods referred to here)
What changes would need to be made to the published code to obtain the other method, or could you publish the code for the other method?


AssafSinger94 commented on August 16, 2024

For each TAP-Vid DAVIS video I apply the following:

  • Place the frames (which are at 256x256 resolution) under the color directory, as described in the preprocessing instructions.
  • Run python main_processing.py --data_dir ../tapvid_davis/processed_256/$i/ --chain (after completing all necessary preprocessing instructions).
  • Run python train.py --config configs/default.txt --data_dir ./tapvid_davis/processed_256/$i/ --save_dir ./tapvid_davis/processed_256/$i/ --num_iters 200000
  • Extract predictions for the query points and compute the metrics (a sketch of this per-video loop is given below).
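
A minimal sketch of that per-video loop in Python, assuming the frames have already been placed under each sequence's color directory as described above; the script names and flags are the ones quoted in this comment, while the root path variable and the run() helper are only for illustration:

    import subprocess
    from pathlib import Path

    root = Path("./tapvid_davis/processed_256")  # one sub-directory per DAVIS sequence

    def run(cmd):
        # Echo the command, then run it and fail loudly on errors.
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)

    for video_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Preprocessing (including flow chaining) for this sequence.
        run(["python", "main_processing.py", "--data_dir", str(video_dir), "--chain"])
        # Train OmniMotion on this sequence.
        run(["python", "train.py",
             "--config", "configs/default.txt",
             "--data_dir", str(video_dir),
             "--save_dir", str(video_dir),
             "--num_iters", "200000"])
        # Finally, extract predictions for the query points and compute the TAP-Vid metrics.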


64327069 commented on August 16, 2024

I mean evaluating the metrics such as OA, AJ, etc.
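
For context, OA is occlusion accuracy and AJ is average Jaccard from the TAP-Vid benchmark. The sketch below only illustrates those definitions for a single video (pixel thresholds 1, 2, 4, 8, 16, assuming the video has some visible ground-truth points, and ignoring details such as excluding the query frame); the official implementation is the one in the TAP-Vid (tapnet) codebase, and the array shapes here are assumptions:

    import numpy as np

    def tapvid_metrics(gt_xy, gt_occ, pred_xy, pred_occ, thresholds=(1, 2, 4, 8, 16)):
        # gt_xy, pred_xy: (num_points, num_frames, 2) pixel coordinates.
        # gt_occ, pred_occ: (num_points, num_frames) boolean flags, True = occluded.
        visible = ~gt_occ
        pred_visible = ~pred_occ

        # OA (occlusion accuracy): fraction of point/frame pairs with the correct visibility flag.
        occlusion_acc = float((gt_occ == pred_occ).mean())

        dist = np.linalg.norm(pred_xy - gt_xy, axis=-1)  # (num_points, num_frames)
        pts_within, jaccards = [], []
        for thr in thresholds:
            within = dist < thr
            # Position accuracy: of the ground-truth-visible points, how many are within thr pixels.
            pts_within.append((within & visible).sum() / visible.sum())
            # Jaccard at this threshold: TP / (TP + FP + FN), with
            # TP = visible, predicted visible, and within the threshold;
            # FP = predicted visible where occluded, or predicted visible but too far off;
            # FN = ground-truth-visible points that are not true positives.
            tp = (visible & pred_visible & within).sum()
            fp = ((~visible) & pred_visible).sum() + (visible & pred_visible & ~within).sum()
            gt_positives = visible.sum()  # equals TP + FN
            jaccards.append(tp / (gt_positives + fp))

        return {
            "average_jaccard": float(np.mean(jaccards)),             # AJ
            "average_pts_within_thresh": float(np.mean(pts_within)),
            "occlusion_acc": occlusion_acc,                          # OA
        }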


qianqianwang68 commented on August 16, 2024

Thank you for your questions.

This folder contains a script for evaluation (eval_tapvid_davis.py) and the pre-trained weights, which you can use to reproduce the exact results in the paper.

To run the evaluation:

  1. First download the .py file and the .zip file, unzip the .zip file, and put both in the project directory.
  2. The paper's results were generated by an old model architecture which is slightly different from the released one, so please make the following modifications: change the hidden_size here from [256, 256, 256] to [256, 256], and then change this line to nn.Linear(input_dims + input_dims * ll * 2, proj_dims), nn.ReLU(), nn.Linear(proj_dims, proj_dims) (a sketch of the modified projection is given after this list).
  3. Run python eval_tapvid_davis.py. If the evaluation runs successfully, you should get the following output, which matches the numbers in the paper:
    30 | average_jaccard: 0.51746 | average_pts_within_thresh: 0.67490 | occlusion_acc: 0.85346 | temporal_coherence: 0.74060
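
To visualize the architecture change in step 2, the snippet below sketches what the modified projection could look like; the variable names (input_dims, ll, proj_dims) come from the line quoted above, while the wrapper class itself is illustrative rather than the actual released module:

    import torch.nn as nn

    class PosEncProjection(nn.Module):
        # Illustrative stand-in for the modified line: the input is input_dims raw coordinates
        # concatenated with a positional encoding of ll frequency bands (sin and cos each),
        # projected down to proj_dims with one hidden ReLU layer.
        def __init__(self, input_dims, ll, proj_dims):
            super().__init__()
            self.proj = nn.Sequential(
                nn.Linear(input_dims + input_dims * ll * 2, proj_dims),
                nn.ReLU(),
                nn.Linear(proj_dims, proj_dims),
            )

        def forward(self, x):
            return self.proj(x)

    # The other modification is just the MLP width list: hidden_size = [256, 256] instead of [256, 256, 256].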

Regarding the hyperparameters: yes, we used a different set of hyperparameters for the TAP-Vid evaluation (but they were the same across all TAP-Vid videos). The reason is that TAP-Vid videos have much lower resolution (256x256); we found that RAFT's performance degrades at that resolution, and relying more on the photometric information by upweighting its loss helps improve performance. I hope this is helpful for you, at least for now. Please allow me some time to integrate and organize things into the codebase and release more details.


Mixanik-43 commented on August 16, 2024

Hello!
Given this issue and #42, here are the changes that should be applied to the default config to reproduce training and evaluation on the TAP-Vid dataset from the OmniMotion paper:

  • set args.min_depth = -0.5 and args.max_depth = 0.5
  • set args.use_affine = False
  • set args.num_iters = 200000 (these config-level overrides are collected in a sketch after this list)
  • change the hidden_size here from [256, 256, 256] to [256, 256].
  • change this line to nn.Linear(input_dims + input_dims * ll * 2, proj_dims), nn.ReLU(), nn.Linear(proj_dims, proj_dims).
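
Collected in one place, a minimal sketch of the config-level overrides from the list above (the names and values are exactly the ones listed; the SimpleNamespace container is only for illustration):

    from types import SimpleNamespace

    # Hypothetical consolidation of the config-level overrides listed above.
    tapvid_overrides = SimpleNamespace(
        min_depth=-0.5,
        max_depth=0.5,
        use_affine=False,
        num_iters=200000,
    )
    # The last two items (hidden_size = [256, 256] and the modified nn.Linear projection)
    # are code changes rather than config values; see the sketch earlier in this thread.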

Is this correct? Are there any other changes required to reproduce the quantitative results?

Relying more on the photometric information by upweighting its loss helps improve performance.

So, in your TAP-Vid training, was the photometric loss weight increased from 0 to 10 over the first 50k steps and then kept fixed at 10, or was some other schedule applied?
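
For concreteness, the schedule described in this question (a linear ramp of the photometric loss weight from 0 to 10 over the first 50k steps, then held at 10) would look like the sketch below; this only illustrates the questioner's assumption, not a confirmed detail of the paper's training runs, and the loss names in the trailing comment are hypothetical:

    def photometric_weight(step, ramp_steps=50000, max_weight=10.0):
        # Linear ramp from 0 to max_weight over the first ramp_steps iterations, then constant.
        return max_weight * min(step / ramp_steps, 1.0)

    # e.g. total_loss = flow_loss + photometric_weight(step) * rgb_loss + other_terms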


nargenziano commented on August 16, 2024

Would you mind sharing the full config file used for the results in the paper?


Guo-1212 commented on August 16, 2024

Hello!
In the annotations folder, I can see that each video sequence corresponds to a pkl file. I would like to ask: how was this file obtained? There is no such file in the training results, and I did not find the module that generates this file in the code.


YiJian666 commented on August 16, 2024

Hello! In the annotations folder, each video sequence corresponds to a pkl file. I would like to ask how this file was obtained. There is no such file in the training results, and I did not find the module that generates it in the code.

Hello, have you figured this out? Could you share how the corresponding pkl file was obtained?

