Hi, May I ask what the hyperparameters settings are when using SF+CL


The hyper parameters of using SF+CLIP features on Charades-STA about qd-detr HOT 9 CLOSED

wjun0830 commented on June 12, 2024

The hyper parameters of using SF+CLIP features on Charades-STA

from qd-detr.

Comments (9)

awkrail commented on June 12, 2024

Thank you for your help. With lr_drop=40, lr=0.0002, and lw_saliency=4, we finally reproduced the paper's results.
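
For readers who want to try the same settings, the values above map onto the `--lr`, `--lr_drop`, and `--lw_saliency` flags that appear in the training command quoted in this thread. The following is a minimal sketch, not the authors' script: `TRAIN_SCRIPT` is a hypothetical placeholder for your own Charades-STA (SF+CLIP) training wrapper, and it is assumed that the wrapper forwards extra arguments to the training entry point.

```shell
#!/bin/sh
# Sketch only: TRAIN_SCRIPT is a hypothetical placeholder for your own
# Charades-STA (SF+CLIP) training wrapper; it is not taken from the thread.
TRAIN_SCRIPT="${TRAIN_SCRIPT:-path/to/your_charades_train.sh}"

# Values reported in this thread as reproducing the paper's results.
overrides="--lr 0.0002 --lr_drop 40 --lw_saliency 4"

# Dry run: print the command instead of launching training.
cmd="bash ${TRAIN_SCRIPT} ${overrides}"
echo "${cmd}"
```

Printing the command first makes it easy to confirm the overrides before committing to a long training run.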


zxccade commented on June 12, 2024

I couldn't reproduce the results reported in the paper on the Charades dataset with the SF+CLIP features and default hyperparameters.


wjun0830 commented on June 12, 2024

Hello! Sorry for the late reply; we have been busy with the rebuttal period.

Unfortunately, we currently only have the QD-DETR config file for QVHighlights.
We suggest tuning the learning rate \in {1e-4, 2e-4} and the saliency loss ratio \in {1, 4}.

We have also found that results are not consistent across machines, for some reason related to our codebase (we haven't figured out why).
So we recommend trying different parameters on your specific machine.
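
The suggested search space is small enough to enumerate by hand. The loop below is a minimal sketch of that 2x2 sweep under stated assumptions: `TRAIN_SCRIPT` is a hypothetical placeholder for your own training wrapper, the wrapper is assumed to forward extra arguments, and `--lr` and `--lw_saliency` are the flag names used in the training command quoted later in this thread.

```shell
#!/bin/sh
# Sketch only: enumerates the learning-rate / saliency-loss-weight grid.
# TRAIN_SCRIPT is a hypothetical placeholder, not a file from the repository.
TRAIN_SCRIPT="${TRAIN_SCRIPT:-path/to/your_charades_train.sh}"

runs=0
for lr in 1e-4 2e-4; do
  for lw_saliency in 1 4; do
    runs=$((runs + 1))
    # Dry run: print each command; remove "echo" to launch training.
    echo bash "${TRAIN_SCRIPT}" --lr "${lr}" --lw_saliency "${lw_saliency}"
  done
done
echo "total runs: ${runs}"
```

Because results reportedly vary between machines, it may be worth running the full grid on the target machine rather than reusing a setting found elsewhere.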


zxccade commented on June 12, 2024

Thanks for getting back to me despite your busy schedule.

I've tried the parameters you mentioned, but the model only converges to 45 at R1@0.5, which is far from the numbers reported in the paper. However, when I used I3D features instead of SF+CLIP, the model converged to 53 at R1@0.5, which is close to the numbers in the paper.

I guess I need to change some parameters. It would be great if you could help me find the opt.json file once your rebuttal period is over. Thanks.


awkrail commented on June 12, 2024

@wjun0830 @zxccade
Hi, I also cannot reproduce the Charades-STA experiments with the default parameters.
My results are below: R1@0.5 is about 0.5, far from the 0.57 reported in the paper. Could you share the hyperparameters?

"MR-full-R1@0.5": 51.88,
"MR-full-R1@0.7": 27.93,
"MR-full-mAP": 30.26,
"MR-full-mAP@0.5": 60.92,
"MR-full-mAP@0.75": 25.31


wjun0830 commented on June 12, 2024

We are again very sorry that our codebase is not very robust to different server settings.

Have you tried changing the parameters as suggested above?
You can also refer to the parameters of our follow-up paper, which can be found in the appendix of https://arxiv.org/abs/2311.08835.


awkrail commented on June 12, 2024

Thank you for the reply! I will try the hyperparameters used in CG-DETR; it will take me several hours to test them.


awkrail commented on June 12, 2024

@wjun0830
Sorry for bothering you again. Below are the hyperparameters in CG-DETR. Apart from the CG-DETR-specific parameters, did you use these same parameters for QD-DETR? There are several differences from QD-DETR. For example, enc_layers=3, dec_layers=3, --lr 0.0002, and --lw_saliency 4 are set (in QD-DETR, enc_layers=2, dec_layers=2, lr=0.0001, lw_saliency=1.0).

#### training
bsz=32
eval_bsz=32
num_dummies=45
num_prompts=2
total_prompts=10
lr_drop=400
enc_layers=3
dec_layers=3
t2v_layers=2
dummy_layers=2
moment_layers=1
sent_layers=1

PYTHONPATH=$PYTHONPATH:. python cg_detr/train.py \
--dset_name ${dset_name} \
--ctx_mode ${ctx_mode} \
--train_path ${train_path} \
--eval_path ${eval_path} \
--eval_split_name ${eval_split_name} \
--v_feat_dirs ${v_feat_dirs[@]} \
--v_feat_dim ${v_feat_dim} \
--t_feat_dir ${t_feat_dir} \
--t_feat_dim ${t_feat_dim} \
--bsz ${bsz} \
--results_root ${results_root} \
--exp_id ${exp_id} \
--max_v_l -1 \
--clip_length 1 \
--lr 0.0002 \
--lr_drop ${lr_drop} \
--n_epoch 200 \
--contrastive_align_loss_coef 0.002 \
--lw_saliency 4 \
--enc_layers ${enc_layers} \
--dec_layers ${dec_layers} \
--t2v_layers ${t2v_layers} \
--moment_layers ${moment_layers} \
--dummy_layers ${dummy_layers} \
--sent_layers ${sent_layers} \
--eval_bsz ${eval_bsz} \
--num_dummies ${num_dummies} \
--num_prompts ${num_prompts} \
--total_prompts ${total_prompts} \
${@:1}


wjun0830 commented on June 12, 2024

As far as I remember, we did not change the number of layers in QD-DETR.
The number of layers was modified only in CG-DETR, following works in ICCV.
I recall that lw_saliency and lr are the parameters we tuned for Charades.

