covr's People

Contributors

lucas-ventura

covr's Issues

Discrepancy in CIRR test set

Hi,

I checked the updated results in #15 but cannot reproduce similar numbers on the CIRR test set, although my validation numbers are very similar.

I used this evaluation script and submitted the files to the server:

python test.py test=cirr model/ckpt=cirr_ft-covr+gt

My numbers are:

Method      R@1     R@2     R@5     R@10    R@50    Recall_subset@1  Recall_subset@2  Recall_subset@3
Zero-shot   27.566  38.265  52.506  63.494  85.976  71.277           86.217           94.048
Train       38.241  51.205  67.518  78      94.217  77.277           91.036           96.337

My numbers on the validation set:

Method      R@1    R@5    R@10   R@50   Recall_subset@1  Recall_subset@2  Recall_subset@3
Zero-shot   29.06  55.2   66.13  87.54  72.73            87.39            93.51
Train       41.55  70.08  79.91  94.16  78.51            91.75            95.93

Could you provide your JSON files for the CIRR test set?

Wget issue.

Don't know if this is an issue on my end, but wget fails like this when I run the download script:

bash tools/scripts/download_pretrained_models.sh
Select the model to download:
1) All
2) WebVid-CoVR
3) CIRR
4) FashionIQ
Press Enter for default (All)
Enter your choice (1/2/3/4):
The ckpt_4.ckpt checkpoint already exists in outputs/webvid-covr/blip-large/blip-l-coco/tv-False_loss-hnnce_lr-1e-05/good/.
Do you want to overwrite? [y/N]: y
Downloading ckpt_4.ckpt checkpoint...
wget: unrecognized option '--show-progress'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
Download failed.
The ckpt_5.ckpt checkpoint already exists in outputs/cirr/blip-large/webvid-covr/tv-False_loss-hnnce_lr-0.0001/base/.
Do you want to overwrite? [y/N]: y
Downloading ckpt_5.ckpt checkpoint...
wget: unrecognized option '--show-progress'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
Download failed.
The ckpt_5.ckpt checkpoint already exists in outputs/fashioniq-all/blip-large/webvid-covr/tv-False_loss-hnnce_lr-0.0001/base.
Do you want to overwrite? [y/N]: y
Downloading ckpt_5.ckpt checkpoint...
wget: unrecognized option '--show-progress'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
Download failed.

Don't know if this is a problem on my end. Getting rid of --show-progress did fix it, though.
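
One way to avoid the failure above is to pass --show-progress only when the installed wget lists that option in its help text. The sketch below illustrates the check in Python. The function names wget_help_text and wget_command are hypothetical and not part of the repository's download script, and the URL in the usage comment is a placeholder.

```python
import subprocess


def wget_help_text():
    """Return the output of `wget --help`, or "" if wget is not installed."""
    try:
        result = subprocess.run(
            ["wget", "--help"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return ""
    # Some wget builds print usage to stderr, so look at both streams.
    return result.stdout + result.stderr


def wget_command(url, output_path, help_text):
    """Build a wget argument list, adding --show-progress only if supported."""
    cmd = ["wget", "-O", output_path]
    if "--show-progress" in help_text:
        cmd.append("--show-progress")
    cmd.append(url)
    return cmd


# Usage (placeholder URL, assumed for illustration only):
# cmd = wget_command("https://example.com/ckpt_4.ckpt", "ckpt_4.ckpt", wget_help_text())
# subprocess.run(cmd, check=True)
```

Passing the help text in as an argument keeps the command builder free of side effects, so it can be checked without wget being installed.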

Question about the Increase from 1.2M Paired Videos to 1.6M Triplets

Thank you for sharing your research results.
I have a question related to the data generation process.

According to the paper, after going through the "Filtering caption pairs" step, 1.2M paired videos remained, and modifications were created using them. Subsequently, after filtering the video pairs, a total of 1.6M triplets were produced.

The count has increased by 0.4M compared to the paired videos. Could you explain how this happened?
My guess is that the pairs were used bidirectionally to create triplets (1.2M → 2.4M), and then decreased after filtering (2.4M → 1.6M).

Your clarification on this would be greatly appreciated!
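
For reference, the arithmetic behind this guess can be written out as below. It only restates the hypothesis in the question (each pair used in both directions, then filtered) and is not a confirmed description of the authors' pipeline. The function name implied_retention is hypothetical.

```python
def implied_retention(num_pairs, num_final_triplets, directions=2):
    """Fraction of candidate triplets kept, ASSUMING each pair yields `directions` triplets."""
    candidates = num_pairs * directions
    return num_final_triplets / candidates


# Figures quoted in the question: 1.2M pairs and 1.6M final triplets.
rate = implied_retention(1_200_000, 1_600_000)
print(f"{rate:.1%}")  # 66.7%, i.e. about one third filtered out under this hypothesis
```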

please provide feature file

Hi,
thanks for your great work!
But this dataset is too large to download. Could you provide the BLIP feature files?
Any reply will be helpful!

Question about the test of CIRR

[image attachment]

When calculating the similarity between a query and images in CIRR, it seems that the entire image database only includes the reference image mentioned in the triplets, rather than considering all the images in the entire test set. This seems to be somewhat problematic.

readme file error

To download the annotation files, the README says to run "bash tools/scripts/download_annotations.sh covr".
However, the script is named download_annotation.sh,
so the command should be bash tools/scripts/download_annotation.sh covr,
with no "s" at the end.

Does batch size have a significant impact on performance?

Hello.

I was so impressed with the amazing CoVR task and results proposed by the authors that I tried to reimplement the code.

The code worked well, but my results fall well short of the highlighted part of Table 2 below.

My results on the WebVid dataset:

{'R1': 39.7868, 'R5': 64.722, 'R10': 74.2869, 'R50': 91.4578, 'R_mean': 59.5986} after 5 epochs.

Authors' results:
[image: authors' results, highlighted part of Table 2]

I know that batch size affects contrastive learning (NCE, HN-NCE, etc.), but I didn't expect this much difference. Have you ever checked how recall scores vary with batch size? You mention a batch size of 2048 in the paper.

I ask because, due to the limited GPUs available to me (at most 48 GB of VRAM on a single A6000), I set the training batch size to 48, which seems to have caused a big performance drop.

Again, thanks for the great research and I look forward to your response.
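
To make the batch-size dependence concrete: with in-batch negatives, each query is contrasted against B - 1 other targets, so a batch of 48 gives 47 negatives per query while a batch of 2048 gives 2047. The sketch below is a plain InfoNCE loss over a similarity matrix, written only to illustrate this point. It is not the repository's HN-NCE implementation, and the temperature value is an assumption.

```python
import math


def info_nce(sim, temperature=0.07):
    """Mean InfoNCE loss for a square similarity matrix.

    sim[i][j] is the similarity between query i and target j;
    the diagonal entries are the positives, the rest are in-batch negatives.
    """
    losses = []
    for i, row in enumerate(sim):
        logits = [s / temperature for s in row]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# With uninformative (all-equal) similarities the loss equals log(B),
# so the chance-level loss depends on the batch size B.
for batch_size in (48, 2048):
    uniform = [[1.0] * batch_size for _ in range(batch_size)]
    print(batch_size, batch_size - 1, info_nce(uniform))
```

A smaller batch makes each contrastive problem easier (fewer negatives), which is one commonly cited reason for weaker retrieval after training. How large the effect is for this model would need to be measured.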

Inquiry about the CIRR test dataset

Hi, thank you for this wonderful work.

I appreciate you providing the code.

I wonder how to calculate the recall performance on CIRR.

I ran the code, but I think it only saves the image list of the top-50 similarities and doesn't calculate the recall performance.

So I checked the annotations of CIRR test-1. However, there is no label for the target image, only 'members'.

How do I calculate the recall performance on CIRR?

An answer would be very helpful to me.
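
For splits where the target is known (for example a validation split), recall can be computed locally from ranked candidate lists. The sketch below shows Recall@K and a subset variant that ranks only within each query's 'members' group. The function names and the input layout (one ranked list of candidate IDs per query) are assumptions for illustration, not the repository's output format. Since the test annotations contain no target label, as noted above, test-set recall cannot be computed this way; another issue on this page mentions submitting the result files to the evaluation server instead.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Percentage of queries whose target is among the top-k ranked candidates."""
    hits = sum(
        1 for ranked, target in zip(ranked_ids, target_ids) if target in ranked[:k]
    )
    return 100.0 * hits / len(target_ids)


def recall_subset_at_k(ranked_ids, target_ids, members, k):
    """Recall@K after restricting each ranking to that query's member group."""
    hits = 0
    for ranked, target, group in zip(ranked_ids, target_ids, members):
        allowed = set(group)
        subset_ranking = [c for c in ranked if c in allowed]
        hits += target in subset_ranking[:k]
    return 100.0 * hits / len(target_ids)


# Toy example with two queries (IDs are made up).
ranked = [["a", "b", "c"], ["c", "a", "b"]]
targets = ["b", "b"]
groups = [["b", "c"], ["a", "b"]]
print(recall_at_k(ranked, targets, 2))                 # 50.0
print(recall_subset_at_k(ranked, targets, groups, 1))  # 50.0
```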

Inquiry about the test target video embedding vector

Hi! I'm Cheol-Ho Cho.

I have a question about your work, specifically about the test target video embedding vector.

I read in your paper that the test target video embedding vector is computed as a weighted mean, and that this helps boost performance.

However, your code does not seem to compute a weighted mean.

Could you explain this in more detail?

Best regards, Cheol-Ho
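
For readers unfamiliar with the operation being asked about, a weighted mean of per-frame embeddings can be sketched as follows. This is a generic illustration only: how the weights are obtained is not stated in this issue, so the sketch simply assumes one non-negative score per frame. The function name weighted_mean_embedding is hypothetical.

```python
def weighted_mean_embedding(frame_embeddings, weights):
    """Combine per-frame embeddings into one video embedding.

    frame_embeddings: list of equal-length lists of floats, one per frame.
    weights: one non-negative score per frame (ASSUMED to be given).
    """
    if len(frame_embeddings) != len(weights):
        raise ValueError("need exactly one weight per frame")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(frame_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, frame_embeddings)) / total
        for d in range(dim)
    ]


# Two frames; the first is weighted three times as heavily as the second.
print(weighted_mean_embedding([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0]))  # [0.75, 0.25]
```

With equal weights this reduces to the ordinary mean over frames.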

readme issues

As I said, "bash tools/scripts/download_annotations.sh cirr" and "bash tools/scripts/download_annotations.sh fiq" also have an additional 's' at the end of the script name.

fashionIQ datasets class issue

In configs/test/fashioniq-dress(shirt or toptee).yaml
there is
targets: ${paths.work_dir}/annotation/fashion-iq/split.toptee.val.json

However, the split.toptee.val.json file doesn't exist.

Can I get split.toptee.val.json from you?

What is the search space for a query in the test set?

Hi,
I am slightly confused about the test set's search space. Given a query (Image/video and text) from the test set, what is the search space for this query? I am assuming that we are searching over all the possible target videos in the test set. I'd appreciate it if you could confirm this.
Thanks

Question about training multiple frames and fusion mechanism

Hi, thank you for your great work.

I have run the code and noticed that it only supports single-frame training and is missing the fusion mechanisms (MLP, CA). It would be wonderful if you could provide the full version of those functions.

Thank you for this wonderful work.
