mcc-wh / token
Official implementation of the AAAI 2022 paper "Learning Token-based Representation for Image Retrieval"
License: MIT License
(Referring to line 100 of the code at commit 928a6cc.)
Hello, author.
Your code uses BatchNorm1d in two places. During training I found that, as the number of iterations increases, its running statistics become NaN for an unknown reason. Since BN's running statistics do not affect the training process, training looks completely normal, but problems appear at test time.
From my inspection, among the saved model parameters, only the BatchNorm1d running statistics contain NaN.
Have you encountered this kind of situation? (I can confirm the input is not all zeros and the batch size is not 1.)
Did you also run into this issue, and what caused it?
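The symptom described above (NaN only in the BN running statistics, training unaffected) can be checked directly on a loaded model or checkpoint. A minimal sketch, assuming a PyTorch model; the `nn.Sequential` below is a stand-in for illustration, not the repo's architecture:

```python
import torch
import torch.nn as nn

def find_nan_buffers(model: nn.Module):
    """Return names of BatchNorm running statistics that contain NaN.

    running_mean/running_var are buffers updated outside autograd, so
    NaNs in them do not disturb training but break eval-mode inference.
    """
    bad = []
    for name, buf in model.named_buffers():
        if ("running_mean" in name or "running_var" in name) and torch.isnan(buf).any():
            bad.append(name)
    return bad

# Hypothetical model standing in for the network that uses BatchNorm1d.
model = nn.Sequential(nn.Linear(8, 4), nn.BatchNorm1d(4))
model[1].running_mean[0] = float("nan")  # simulate the corrupted statistic
print(find_nan_buffers(model))  # -> ['1.running_mean']
```

Running this check on the saved checkpoint right after training would confirm whether only the BN buffers are affected, as the report suggests.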
Hello @MCC-WH
Thank you for sharing the good code.
I have noticed a large performance difference between your paper and the released weights, especially on the +1M experiments.
| | ROxf-M | +1M | RPar-M | +1M | ROxf-H | +1M | RPar-H | +1M |
|---|---|---|---|---|---|---|---|---|
| TOKEN-R50-Paper | 79.42 | 73.68 | 88.67 | 77.56 | 59.48 | 49.55 | 76.49 | 58.92 |
| TOKEN-R50-Released weights | 79.79 | 67.36 | 88.08 | 74.33 | 62.68 | 45.70 | 75.49 | 52.68 |
| TOKEN-R101-Paper | 82.28 | 75.64 | 89.34 | 79.76 | 66.57 | 51.37 | 78.56 | 61.56 |
| TOKEN-R101-Released weights | 82.16 | 70.58 | 89.40 | 77.24 | 65.75 | 47.46 | 78.44 | 56.81 |
The same goes for the DELG you reproduced.
| | ROxf-M | +1M | RPar-M | +1M | ROxf-H | +1M | RPar-H | +1M |
|---|---|---|---|---|---|---|---|---|
| DELG-R101-Reproduced-Paper | 78.24 | 68.36 | 88.21 | 75.83 | 60.15 | 44.41 | 76.15 | 52.40 |
| DELG-R101-Reproduced-Released weights | 78.55 | 66.02 | 88.58 | 73.65 | 60.89 | 41.75 | 76.05 | 51.46 |
Could you please check whether the performance numbers reported in the paper are correct?
If they are, could you please share the model weights (R50-Token, R101-Token) used in the paper so the reported performance can be verified?
It should also be noted that this discrepancy may affect the reviews and results of many (landmark) image retrieval papers submitted, or to be submitted, to conferences and journals.
Can you tell me how to train on my own dataset? What should the dataset format look like?
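The exact format the repo expects is for the authors to confirm. As a rough illustration only, GLDv2-style training annotations are typically a CSV mapping each image file to an integer landmark id; the column names and layout below are hypothetical:

```python
import csv
import io

# Hypothetical layout mirroring a GLDv2-style training csv:
# one row per image, with a path (relative to image_dir) and an integer landmark id.
SAMPLE = """path,landmark_id
train/0/00a1b2.jpg,0
train/0/00c3d4.jpg,0
train/1/0f9e8d.jpg,1
"""

def read_annotations(fh):
    """Parse (image path, class label) pairs from a csv file handle."""
    rows = list(csv.DictReader(fh))
    paths = [r["path"] for r in rows]
    labels = [int(r["landmark_id"]) for r in rows]
    return paths, labels

paths, labels = read_annotations(io.StringIO(SAMPLE))
print(len(paths), labels)  # -> 3 [0, 0, 1]
```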
Hi, authors. Impressive paper and great results! Have you removed the overlapping classes between GLDv2 and Oxford/Paris during training? The GeM and DELF papers claim that they removed the overlapping classes, but I did not find this claim in your paper. I recognize your contribution whether or not you removed the overlap. Your model outperforms DELG by a large margin, even though both are trained on the same GLDv2. I am just asking out of curiosity. Thank you!
Hi, authors. I am impressed by your great work. I am now trying to run your code with the same configuration (4 GPUs, the same parameters, and the same training dataset). In the first epoch the loss gradually decreases from 18.3 to 17.5, but in the second epoch the training loss rises sharply (from 18 to 200+), and by the fourth epoch it reaches 7000+. In addition, the top-5 error is always above 99%. To debug this, I am now generating more logs to investigate; it may be caused by an invalid gradient. If possible, could you please share your training_logger and val_logger results with me? I would also appreciate it if you could share "Best_checkpoint.pth". Thank you!
What I have changed:
```
configdataset.py -> GLDv2_build_train_dataset(csv_path, clean_csv_path, image_dir, output_directory, True, 0.2, 0)  # boolean argument changed from False to True
```
And the val_logger plot uses the same data as the training logger.
Thank you, author! But I still have a question I'd like to ask.
The paper specifically presents Table 5 to argue that the learned tokenizer is weaker than the attention-based tokenizer. In the code, however, there is a query term that is itself learned, which means the final attention map is also obtained from a query-based transformation of the original features. Is there an inconsistency between the paper and the code here?
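To make the question concrete, the pattern being described is cross-attention where learned query vectors attend over the CNN feature map to produce visual tokens. A hypothetical sketch of that pattern, not the repo's exact implementation:

```python
import torch
import torch.nn as nn

class QueryTokenizer(nn.Module):
    """Sketch of an attention tokenizer with L *learned* queries.

    The attention map is softmax(query @ features^T), so it depends on a
    learned parameter -- the point the question above is raising.
    """
    def __init__(self, dim: int, num_tokens: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(num_tokens, dim))  # learned queries

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, HW, dim) -- flattened spatial features from the backbone
        attn = torch.softmax(self.query @ feats.transpose(1, 2), dim=-1)  # (B, L, HW)
        return attn @ feats  # (B, L, dim) visual tokens

tok = QueryTokenizer(dim=32, num_tokens=4)
tokens = tok(torch.randn(2, 49, 32))
print(tokens.shape)  # -> torch.Size([2, 4, 32])
```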
I use experiment.sh to train the model, but the model loss does not drop properly. Tested on the open-source test set, the results are as follows:
70/70 done...>> Test Dataset: roxford5k *** Feature Type: Token >>
mAP Easy: 1.0, Medium: 1.78, Hard: 0.91
mP@k[1, 5, 10] Easy: [0. 1.18 1.03], Medium: [1.43 2. 2.14], Hard: [1.43 0.86 1.14]
70/70 done...>> Test Dataset: rparis6k *** Feature Type: Token >>
mAP Easy: 2.25, Medium: 4.41, Hard: 2.52
mP@k[1, 5, 10] Easy: [1.43 3.43 3.29], Medium: [2.86 7.43 6.71], Hard: [1.43 4.57 3.86]
Why are the scores so low?
Hi, where are the model weights?
Dear author, thanks so much for kindly providing the code. Your work is great and interesting. However, I have tried to reproduce it with the source code you provided and cannot reach the reported performance, even when using all of the Google Landmarks v2 clean set instead of 80% of it. I have also gone through all of the issues reported here and found a big difference in the training parameter settings between the ones you gave and the log you showed [https://github.com//issues/7] in one of the issues: it seems you are using a batch size of 320 with an initial learning rate of 0.01. Could you please tell me which setting is correct? I tried to reproduce your performance with the settings in the paper but failed. Could you tell me how exactly you obtained the reported performance? Thanks so much.
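If the discrepancy really is the batch size (320 in the log vs. the paper's setting), one common heuristic when re-deriving a learning rate for a different batch size is linear scaling. This is an assumption about how to reconcile the two configs, not the authors' recipe; the 128 below is an arbitrary example batch size:

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear-scaling heuristic: scale lr by the batch-size ratio."""
    return base_lr * new_batch / base_batch

# e.g. starting from the logged setting (batch 320, lr 0.01)
# and moving to a hypothetical batch size of 128:
print(scaled_lr(0.01, 320, 128))  # -> 0.004
```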