
Comments (10)

64327069 commented on August 16, 2024

I would like to ask how to evaluate the results: there is no evaluation code in the repo, nor any guidance on evaluation. Could you provide the eval code? I am reproducing the results now.


AssafSinger94 commented on August 16, 2024

Given this situation, I wanted to ask: would it be possible for you to provide the pre-trained weights for TAP-Vid DAVIS? I think that would be the optimal solution, and much simpler than retraining the model and making sure everything works perfectly. It would be deeply appreciated.


Guo-1212 commented on August 16, 2024

The paper mentions these two methods:
(Screenshot 2024-02-22 181011: the two methods referred to here)
What changes would need to be made to the published code to obtain the other method, or could you publish the code for the other method?


AssafSinger94 commented on August 16, 2024

For each TAP-Vid DAVIS video I apply the following:

  • Place the frames (which are at 256x256 resolution) under the color directory, as described in the preprocessing instructions.
  • Run python main_processing.py --data_dir ../tapvid_davis/processed_256/$i/ --chain (after completing all necessary preprocessing instructions).
  • Run python train.py --config configs/default.txt --data_dir ./tapvid_davis/processed_256/$i/ --save_dir ./tapvid_davis/processed_256/$i/ --num_iters 200000
  • Extract predictions for the query points and compute the metrics (a sketch of this per-video loop is given below).
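
A minimal sketch of that per-video loop in Python, assuming the frames have already been placed under each sequence's color directory as described above; the script names and flags are the ones quoted in this comment, while the root path variable and the run() helper are only for illustration:

    import subprocess
    from pathlib import Path

    root = Path("./tapvid_davis/processed_256")  # one sub-directory per DAVIS sequence

    def run(cmd):
        # Echo the command, then run it and fail loudly on errors.
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)

    for video_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Preprocessing (including flow chaining) for this sequence.
        run(["python", "main_processing.py", "--data_dir", str(video_dir), "--chain"])
        # Train OmniMotion on this sequence.
        run(["python", "train.py",
             "--config", "configs/default.txt",
             "--data_dir", str(video_dir),
             "--save_dir", str(video_dir),
             "--num_iters", "200000"])
        # Finally, extract predictions for the query points and compute the TAP-Vid metrics.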


64327069 commented on August 16, 2024

I mean evaluating the metrics such as OA, AJ, etc.
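
For context, OA is occlusion accuracy and AJ is average Jaccard from the TAP-Vid benchmark. The sketch below only illustrates those definitions for a single video (pixel thresholds 1, 2, 4, 8, 16, assuming the video has some visible ground-truth points, and ignoring details such as excluding the query frame); the official implementation is the one in the TAP-Vid (tapnet) codebase, and the array shapes here are assumptions:

    import numpy as np

    def tapvid_metrics(gt_xy, gt_occ, pred_xy, pred_occ, thresholds=(1, 2, 4, 8, 16)):
        # gt_xy, pred_xy: (num_points, num_frames, 2) pixel coordinates.
        # gt_occ, pred_occ: (num_points, num_frames) boolean flags, True = occluded.
        visible = ~gt_occ
        pred_visible = ~pred_occ

        # OA (occlusion accuracy): fraction of point/frame pairs with the correct visibility flag.
        occlusion_acc = float((gt_occ == pred_occ).mean())

        dist = np.linalg.norm(pred_xy - gt_xy, axis=-1)  # (num_points, num_frames)
        pts_within, jaccards = [], []
        for thr in thresholds:
            within = dist < thr
            # Position accuracy: of the ground-truth-visible points, how many are within thr pixels.
            pts_within.append((within & visible).sum() / visible.sum())
            # Jaccard at this threshold: TP / (TP + FP + FN), with
            # TP = visible, predicted visible, and within the threshold;
            # FP = predicted visible where occluded, or predicted visible but too far off;
            # FN = ground-truth-visible points that are not true positives.
            tp = (visible & pred_visible & within).sum()
            fp = ((~visible) & pred_visible).sum() + (visible & pred_visible & ~within).sum()
            gt_positives = visible.sum()  # equals TP + FN
            jaccards.append(tp / (gt_positives + fp))

        return {
            "average_jaccard": float(np.mean(jaccards)),             # AJ
            "average_pts_within_thresh": float(np.mean(pts_within)),
            "occlusion_acc": occlusion_acc,                          # OA
        }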


qianqianwang68 commented on August 16, 2024

Thank you for your questions.

This folder contains a script for evaluation (eval_tapvid_davis.py) and the pre-trained weights, which you can use to reproduce the exact results in the paper.

To run the evaluation:

  1. First download the .py file and the .zip file, unzip the .zip file, and put both in the project directory.
  2. The paper's results were generated by an old model architecture which is slightly different from the released one, so please make the following modifications: change the hidden_size here from [256, 256, 256] to [256, 256], and then change this line to nn.Linear(input_dims + input_dims * ll * 2, proj_dims), nn.ReLU(), nn.Linear(proj_dims, proj_dims) (a sketch of the modified projection is given after this list).
  3. Run python eval_tapvid_davis.py. If the evaluation runs successfully, you should get the following output, which matches the numbers in the paper:
    30 | average_jaccard: 0.51746 | average_pts_within_thresh: 0.67490 | occlusion_acc: 0.85346 | temporal_coherence: 0.74060
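
To visualize the architecture change in step 2, the snippet below sketches what the modified projection could look like; the variable names (input_dims, ll, proj_dims) come from the line quoted above, while the wrapper class itself is illustrative rather than the actual released module:

    import torch.nn as nn

    class PosEncProjection(nn.Module):
        # Illustrative stand-in for the modified line: the input is input_dims raw coordinates
        # concatenated with a positional encoding of ll frequency bands (sin and cos each),
        # projected down to proj_dims with one hidden ReLU layer.
        def __init__(self, input_dims, ll, proj_dims):
            super().__init__()
            self.proj = nn.Sequential(
                nn.Linear(input_dims + input_dims * ll * 2, proj_dims),
                nn.ReLU(),
                nn.Linear(proj_dims, proj_dims),
            )

        def forward(self, x):
            return self.proj(x)

    # The other modification is just the MLP width list: hidden_size = [256, 256] instead of [256, 256, 256].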

Regarding the hyperparameters: yes, we used a different set of hyperparameters for the TAP-Vid evaluation (but they were the same across all TAP-Vid videos). The reason is that TAP-Vid videos have much lower resolution (256x256); we found that RAFT's performance degrades at that resolution, and relying more on the photometric information by upweighting its loss helps improve performance. I hope this is helpful for you, at least for now. Please allow me some time to integrate and organize things into the codebase and release more details.


Mixanik-43 commented on August 16, 2024

Hello!
Given this issue and #42, here are the changes that should be applied to the default config to reproduce training and evaluation on the TAP-Vid dataset from the OmniMotion paper:

  • set args.min_depth = -0.5 and args.max_depth = 0.5
  • set args.use_affine = False
  • set args.num_iters = 200000 (these config-level overrides are collected in a sketch after this list)
  • change the hidden_size here from [256, 256, 256] to [256, 256].
  • change this line to nn.Linear(input_dims + input_dims * ll * 2, proj_dims), nn.ReLU(), nn.Linear(proj_dims, proj_dims).
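
Collected in one place, a minimal sketch of the config-level overrides from the list above (the names and values are exactly the ones listed; the SimpleNamespace container is only for illustration):

    from types import SimpleNamespace

    # Hypothetical consolidation of the config-level overrides listed above.
    tapvid_overrides = SimpleNamespace(
        min_depth=-0.5,
        max_depth=0.5,
        use_affine=False,
        num_iters=200000,
    )
    # The last two items (hidden_size = [256, 256] and the modified nn.Linear projection)
    # are code changes rather than config values; see the sketch earlier in this thread.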

Is this correct? Are there any other changes required to reproduce the quantitative results?

Relying more on the photometric information by upweighting its loss helps improve performance.

So, in your TAP-Vid training, was the photometric loss weight increased from 0 to 10 over the first 50k steps and then kept fixed at 10, or was some other schedule applied?
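
For concreteness, the schedule described in this question (a linear ramp of the photometric loss weight from 0 to 10 over the first 50k steps, then held at 10) would look like the sketch below; this only illustrates the questioner's assumption, not a confirmed detail of the paper's training runs, and the loss names in the trailing comment are hypothetical:

    def photometric_weight(step, ramp_steps=50000, max_weight=10.0):
        # Linear ramp from 0 to max_weight over the first ramp_steps iterations, then constant.
        return max_weight * min(step / ramp_steps, 1.0)

    # e.g. total_loss = flow_loss + photometric_weight(step) * rgb_loss + other_terms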


nargenziano commented on August 16, 2024

Would you mind sharing the full config file used for the results in the paper?


Guo-1212 commented on August 16, 2024

Hello!
In the annotations folder, I can see that each video sequence corresponds to a pkl file. I would like to ask: how was this file obtained? There is no such file in the training results, and I did not find the module that generates this file in the code.


YiJian666 commented on August 16, 2024

Hello! In the annotations folder, each video sequence corresponds to a pkl file. I would like to ask how this file was obtained. There is no such file in the training results, and I did not find the module that generates it in the code.

Hello, have you figured this out? Could you share how the corresponding pkl file was obtained?

