
Comments (14)

jeremy43 commented on September 15, 2024

Actually, the current frame can be randomly sampled during training; when that happens, the current frame is included in the aggregation.

from flow-guided-feature-aggregation.

zhengzhugithub commented on September 15, 2024

"The current frame can be randomly sampled in the training process." Is this in the current code? If so, where can I find it? Thank you.

jeremy43 commented on September 15, 2024

You can find it in the function "get_triple_image" (in lib/utils/image.py).

zhengzhugithub commented on September 15, 2024

But the function "get_triple_image" is not used anywhere in the project. Also, in 'train_end2end.py', aggregation is done over 2N frames (without the current frame), but over 2N+1 frames (including the current frame) at test time. Can you explain more? Thank you.

jeremy43 commented on September 15, 2024

For the first question, "get_triple_image is not used":
During training, data is loaded through "AnchorLoader" (instantiated in fgfa_rfcn/train_end2end.py, defined in fgfa_rfcn/core/loader.py). There, "self.get_batch_individual()" fills in provide_data and provide_label, and at line 349 it calls "self.parfetch". "parfetch" (fgfa_rfcn/core/loader.py, line 357) in turn calls "get_rpn_triple_batch" (lib/rpn/rpn.py, line 102), which finally calls "get_triple_image". In that function, the current frame can be randomly sampled.
For the second question, "no current frame is used in the training process":
The current frame can be randomly sampled, as done in "get_triple_image". :)
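The sampling described above can be sketched roughly as follows (a hedged illustration: only the location of "get_triple_image" comes from the thread; the function name, offset range, and parameters here are illustrative and may differ from the actual code):

```python
import random

def sample_triple_frames(cur_id, num_frames, max_offset=9):
    # Illustrative sketch of the sampling done in get_triple_image
    # (lib/utils/image.py): pick a "before" and an "after" reference
    # frame at random offsets around the current frame. Because an
    # offset of 0 is allowed, the current frame itself can be drawn
    # as a reference, so it can take part in the aggregation.
    before_id = max(cur_id - random.randint(0, max_offset), 0)
    after_id = min(cur_id + random.randint(0, max_offset), num_frames - 1)
    return before_id, cur_id, after_id
```

With offsets drawn from [0, max_offset], the "before"/"after" frames occasionally coincide with the current frame, which is the behavior being discussed here.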

zhengzhugithub commented on September 15, 2024

Thanks for your answer. Now I have another question, since I am not familiar with MXNet. The training network has three inputs: data, data_before, and data_after. But the batch size is 2 during training, so a batch cannot even hold 3 frames. Is there a mechanism in MXNet to train only part of the network? Thank you.

jeremy43 commented on September 15, 2024

Actually, during training one batch deals with only the current frame (the other two frames serve as references); the details are in the paper https://arxiv.org/abs/1703.10025. In other words, the combination of those three frames is handled within a single batch.

zhengzhugithub commented on September 15, 2024

You mean that 1 batch contains three frames, so it contains six frames when the batch size is 2?

einsiedler0408 commented on September 15, 2024

@zhengzhugithub For the training strategy, please refer to our paper. We use 4 GPUs during training, with each GPU holding one mini-batch. The loss function is applied as in Eq. 3. During training, temporal dropout is applied to avoid running out of memory, i.e. we randomly sample 2 frames for feature aggregation.
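The temporal-dropout sampling mentioned here can be sketched in a few lines (names are illustrative; the real sampling lives inside the project's data loader):

```python
import random

def temporal_dropout(window_frame_ids, num_keep=2):
    # Temporal dropout as described above: instead of aggregating over
    # the full temporal window at train time, randomly keep only
    # num_keep frames to reduce memory use. At test time the full
    # window is aggregated instead.
    return sorted(random.sample(window_frame_ids, num_keep))
```

For example, `temporal_dropout(list(range(-9, 10)))` would keep 2 of the 19 candidate frame offsets for a given training iteration.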

einsiedler0408 commented on September 15, 2024

@zhengzhugithub You can also refer to Table 3 in our paper, which validates the temporal dropout trick, i.e. randomly sampling 2 frames for training is enough.

zhengzhugithub commented on September 15, 2024

@einsiedler0408 When there is only 1 GPU, one mini-batch contains only 2 frames. But the training network needs at least 3 frames; how do you deal with that? Thank you.

einsiedler0408 commented on September 15, 2024

@zhengzhugithub Please refer to Eq. 2 and Eq. 4 in our paper. The adaptive weight applied to the warped feature f_(j->i) when aggregating into f_i is proportional to cosine(f_(j->i)^e, f_i^e).

einsiedler0408 commented on September 15, 2024

@zhengzhugithub So the answer is that we do not need three distinct frames during training, but we do need f_i (the features of the current frame) to compute the adaptive weight.
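That weight computation (cosine similarity per Eq. 2, softmax-normalized over frames per Eq. 4) can be sketched with NumPy; shapes and names here are illustrative, since the actual model computes this with an embedding sub-network inside MXNet:

```python
import numpy as np

def adaptive_weights(f_e_warped, f_e_cur, eps=1e-8):
    # Eq. 2: per-position cosine similarity between each warped embedding
    # f^e_{j->i}, stacked as shape (J, C, H, W), and the current frame's
    # embedding f^e_i of shape (C, H, W).
    num = (f_e_warped * f_e_cur[None]).sum(axis=1)                 # (J, H, W)
    den = (np.linalg.norm(f_e_warped, axis=1)
           * np.linalg.norm(f_e_cur, axis=0)[None] + eps)
    cos = num / den
    # Eq. 4: normalize the weights over the J frames with a softmax,
    # so they sum to 1 at every spatial position.
    w = np.exp(cos - cos.max(axis=0, keepdims=True))
    return w / w.sum(axis=0, keepdims=True)                        # (J, H, W)
```

Note that the current frame's own embedding always enters the similarity, which is why f_i is needed even when only 2 frames are aggregated.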

zhengzhugithub commented on September 15, 2024

Thank you. Problem solved.
