Comments (14)
Actually, the current frame can be randomly sampled in the training process; when this occurs, feature aggregation includes the current frame.
from flow-guided-feature-aggregation.
"current frame can be randomly sampled in the training process." Is this in the current code? If so, where can I find it? Thank you.
You can find it in the function "get_triple_image" (in lib/utils/image.py).
But the function "get_triple_image" is not used in the project. Also, in 'train_end2end.py', aggregation is done over 2N frames (not including the current frame) during training, but over 2N+1 frames (including the current frame) during testing. Can you explain more? Thank you.
For the former question ("get_triple_image is not used"):
In the training process, we load training data with "AnchorLoader" (in fgfa_rfcn/train_end2end.py). Inside "AnchorLoader" (in fgfa_rfcn/core/loader.py), the function "self.get_batch_individual()" fills in provide_data and provide_label. At line 349 of "get_batch_individual" it calls "self.parfetch"; "parfetch" (fgfa_rfcn/core/loader.py, line 357) in turn calls "get_rpn_triple_batch" (lib/rpn/rpn.py, line 102), which uses "get_triple_image". In that function, the current frame can be randomly sampled.
For the latter question ("no current frame is used in the training process"):
The current frame can be randomly sampled by "get_triple_image". :)
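To make the sampling behavior concrete, here is a minimal sketch of how a (before, current, after) triple could be drawn. This is an illustration, not the repository's actual code; the function name and offset parameters are hypothetical, but the key point matches the discussion: since 0 lies in the offset range, a reference frame can coincide with the current frame itself.

```python
import random

def sample_triple_indices(cur_id, min_offset, max_offset):
    """Sample frame indices for a (before, current, after) triple.

    min_offset is negative (e.g. -9), max_offset positive (e.g. 9).
    Because 0 is inside both ranges, the sampled 'before' or 'after'
    frame can be the current frame itself.
    """
    before_id = cur_id + random.randint(min_offset, 0)  # offset in [min_offset, 0]
    after_id = cur_id + random.randint(0, max_offset)   # offset in [0, max_offset]
    return before_id, cur_id, after_id
```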
Thanks for your answer. Now I have another question, since I am not familiar with MXNet. The training network has three inputs: data, data_before and data_after. But the batch size is 2 during training, which cannot even hold 3 frames. Is there a mechanism in MXNet to train only part of the network? Thank you.
Actually, in the training process, one batch deals with only the current frame (the other two frames serve as references); the details are in the paper https://arxiv.org/abs/1703.10025. In other words, the composition of those three frames is handled within one batch.
You mean that one batch contains three frames, and six frames when the batch size is 2?
@zhengzhugithub For the training strategy, please refer to our paper. We use 4 GPUs during training, with each GPU holding one mini-batch. The loss function is given in Eq. 3. During training, temporal dropout is applied to avoid running out of memory, i.e. we randomly sample 2 frames for feature aggregation.
@zhengzhugithub You can also refer to Table 3 in our paper, which validates our temporal dropout trick, i.e. randomly sampling 2 frames during training is enough.
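The temporal dropout trick described above can be sketched as follows. This is an illustrative simplification, not the repository's code: the function name and parameters are hypothetical, and the window size of 9 is just an example default. The idea is to pick a small random subset of the temporal window as reference frames instead of aggregating over all of them.

```python
import random

def temporal_dropout(cur_id, num_frames, window=9, num_samples=2):
    """Randomly pick `num_samples` reference frames from the temporal
    window [cur_id - window, cur_id + window], clipped to the video.

    Note the current frame itself is a valid candidate, consistent with
    the discussion above that it can be randomly sampled.
    """
    lo = max(0, cur_id - window)
    hi = min(num_frames - 1, cur_id + window)
    candidates = list(range(lo, hi + 1))
    return random.sample(candidates, num_samples)
```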
@einsiedler0408 When there is only 1 GPU, one mini-batch contains only 2 frames. But the training network needs at least 3 frames; how do you deal with that? Thank you.
@zhengzhugithub Please refer to Eq. 2 and Eq. 4 in our paper. The adaptive weight applied to the warped feature f_(j->i) when computing the aggregated feature is proportional to cosine(f_(j->i)^e, f_i^e).
@zhengzhugithub So the answer is that we do not need three frames during training, but we do need f_i (features from the current frame) to compute the adaptive weights.
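The adaptive weighting described above can be sketched as follows. This is a simplified per-frame version (the paper's Eq. 2/Eq. 4 compute weights per spatial location); the function name is hypothetical. Each warped reference embedding is compared to the current-frame embedding by cosine similarity, and the similarities are normalized so the weights sum to 1.

```python
import numpy as np

def adaptive_weights(emb_refs, emb_cur):
    """Compute normalized adaptive weights from the cosine similarity
    between each warped reference embedding and the current-frame
    embedding (a per-frame simplification of the paper's scheme)."""
    cur = emb_cur / np.linalg.norm(emb_cur)
    sims = np.array([np.dot(e / np.linalg.norm(e), cur) for e in emb_refs])
    # exponentiate and normalize (softmax over frames) so weights sum to 1
    w = np.exp(sims)
    return w / w.sum()
```

A reference frame whose embedding is closer to the current frame's gets a larger weight, which is why f_i is needed even though only 2 reference frames are sampled.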
Thank you. Problem solved.