Hello.
I was so impressed with the amazing CoVR task and results proposed by the authors that I tried to reimplement the code.
The code ran fine, but my results fall well short of the highlighted part of Table 2 below.
My results on the WebVid dataset:
{'R1': 39.7868, 'R5': 64.722, 'R10': 74.2869, 'R50': 91.4578, 'R_mean': 59.5986} after 5 epochs.
Authors' results:
I know that batch size affects contrastive learning (NCE, HN-NCE, etc.), but I didn't expect this much of a difference. Have you checked how the recall scores change with batch size? The paper mentions a batch size of 2048.
I ask because my GPU resources are limited (a single A6000 with 48 GB of VRAM), so I set the training batch size to 48, which seems to have caused a large performance drop.
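To make my concern concrete, here is a rough sketch of why I would expect batch size to matter. It is not the repository's loss code, and the helper names `in_batch_negatives` and `uniform_infonce_loss` are my own. It assumes a plain in-batch InfoNCE setup where each query is scored against every target in the batch, and compares how many negatives each query sees at batch size 48 versus 2048.

```python
import math


def in_batch_negatives(batch_size: int) -> int:
    """Negatives per query when all other targets in the batch act as negatives."""
    return batch_size - 1


def uniform_infonce_loss(batch_size: int) -> float:
    """InfoNCE loss when every candidate gets the same score.

    With equal similarities the softmax is uniform over the batch, so the
    cross-entropy for the positive is -log(1 / batch_size) = log(batch_size).
    """
    return math.log(batch_size)


if __name__ == "__main__":
    for batch_size in (48, 2048):  # my setting vs. the batch size from the paper
        print(
            f"batch={batch_size:5d}  "
            f"negatives/query={in_batch_negatives(batch_size):5d}  "
            f"uniform-score loss={uniform_infonce_loss(batch_size):.3f}"
        )
```

If this assumption matches the training setup, each query in my run is contrasted against 47 negatives instead of 2047, so hard negatives would be much rarer in each step.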
Again, thanks for the great research and I look forward to your response.