Comments (3)

qianqianwang68 commented on August 16, 2024

Thank you for your question! TAPIR is undoubtedly an amazing work. Our method and TAPIR work in fundamentally different ways, and I believe they are complementary.

  • TAPIR and most tracking methods are feed-forward methods, whereas ours is a test-time optimization method. TAPIR is trained on large amounts of video data, and when given a new video sequence at test time, it directly computes the raw tracking results for that video. Our method, by contrast, needs to be optimized on each video separately (substantially slower!). To perform the optimization, our method takes the raw tracking results from existing methods as the noisy supervising signal. So methods like TAPIR provide the input motion to our system, and our method reconciles and completes the possibly noisy and inconsistent motion to obtain a global motion representation for the video. With better input motion, the results of our method will likely improve as well. As mentioned on TAPIR's webpage, our method "could potentially be used on top of TAPIR tracks to further improve performance." Note that TAPIR achieves much better tracking accuracy on the TAP-Vid benchmark than OmniMotion optimized with input motion data from RAFT.

  • Some other differences include: our method produces a compact representation of the motion of the entire video; our optimized motion tends to be more temporally coherent; our method can provide plausible locations for points when they are occluded; our method provides pseudo-3D reconstructions.
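To make the "test-time optimization on noisy input motion" idea above concrete, here is a minimal toy sketch. It is not the OmniMotion implementation: it assumes a single point moving in 1-D, treats noisy pairwise displacements between frames as the supervising signal, and fits one globally consistent track by plain gradient descent. The function name `fit_global_track` and all parameters are hypothetical and chosen only for illustration.

```python
def fit_global_track(num_frames, pairwise, steps=2000, lr=0.05):
    """Toy per-video optimization (hypothetical helper, not OmniMotion).

    pairwise maps a frame pair (i, j) to a noisy displacement d_ij that
    approximates x_j - x_i, e.g. as produced by some feed-forward tracker.
    Returns per-frame positions x_t with frame 0 pinned at 0.
    """
    x = [0.0] * num_frames
    for _ in range(steps):
        grad = [0.0] * num_frames
        for (i, j), d in pairwise.items():
            r = (x[j] - x[i]) - d  # residual against the noisy supervision
            grad[j] += 2.0 * r
            grad[i] -= 2.0 * r
        # Frame 0 stays fixed to remove the global offset ambiguity.
        for t in range(1, num_frames):
            x[t] -= lr * grad[t] / len(pairwise)
    return x


if __name__ == "__main__":
    # Inconsistent input motion: 0->1 and 1->2 sum to 2.0, but 0->2 says 2.3.
    noisy_pairs = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 2.3}
    track = fit_global_track(3, noisy_pairs)
    # The least-squares compromise is close to [0.0, 1.1, 2.2].
    print([round(v, 3) for v in track])
```

The point of the sketch is only that the optimized track is a single representation that all frame pairs must agree with, so inconsistencies in the raw input motion get averaged out rather than passed through.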

Lastly, in my opinion, we need both generalizable methods like TAPIR, which learn very useful priors from data, and test-time optimization methods like ours, which can take the noisy motion data and refine it for a particular video sequence for better quality and coherence.

from omnimotion.

dongxinyu1030 commented on August 16, 2024

Regarding this step: "To perform the optimization, our method takes the raw tracking results from existing methods as the noisy supervising signal." Does your method need trajectories across all video frames, or only the frames before the current time t?

qianqianwang68 commented on August 16, 2024

It takes the trajectories across all video frames.
