Problems about reproducing the same VQA-X filtered scores reported in paper (nlxgpt, closed)

kaipo-chang commented on September 15, 2024

Problems about reproducing the same VQA-X filtered scores reported in paper

from nlxgpt.

Comments (3)

fawazsammani commented on September 15, 2024

Hi @kaipo-chang
Are you changing any of the hyperparameters during training? Are you initializing from the pretrained model? Are you running the filter_and_get_scores function? Are you using the coco-caption version I indicated in the README?

I recommend you first use the checkpoint I provided and check whether you get the scores reported in the paper. Alternatively, you can directly evaluate the JSON output I provided. If you still do not get the correct scores, you are probably using a different coco-caption.


kaipo-chang commented on September 15, 2024

Thank you for your quick reply. After your kind reminder, I found that I forgot to set finetuned_pretrained to True, so distilGPT2 was initialized from Hugging Face instead of from the pretrained model. Sorry for the careless mistake.

On another note, may I ask why the CLIP ViT encoder is kept fixed during finetuning? Have you run any experiments on this?


fawazsammani commented on September 15, 2024

@kaipo-chang Finetuning CLIP is not as safe and easy as you might expect. You need to carefully follow very specific guides and hyperparameters for it to work, and even then it may not work in the end. You can find these guides and hyperparameters in one of the issues of the CLIP GitHub repo, I think. I remember following them, and it ended up not working at all. Also, I think keeping a strong modality featurizer fixed (e.g., the image features) and training the other one (text) has a good impact; this is the topic of the popular LiT paper, also published at CVPR 2022. Training also becomes much faster in this case. You may try finetuning CLIP, and I would love to hear from you if you manage to make it work!
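For readers who want to see what "keeping the image featurizer fixed" looks like in practice, here is a minimal PyTorch sketch. It is not the nlxgpt training code: `image_encoder` and `text_decoder` are tiny hypothetical stand-ins (plain linear layers) for the CLIP ViT encoder and the distilGPT2 decoder, and the loss and data are random placeholders. It only illustrates the general pattern of freezing one module and optimizing the other.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins, for illustration only:
image_encoder = nn.Linear(8, 4)  # plays the role of the frozen CLIP ViT encoder
text_decoder = nn.Linear(4, 2)   # plays the role of the trainable text decoder

# Freeze the encoder: no gradients are computed for its parameters.
for param in image_encoder.parameters():
    param.requires_grad = False
image_encoder.eval()  # also disables dropout etc. if the real encoder has any

# Only the decoder's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(text_decoder.parameters(), lr=1e-3)

# Snapshots so we can check what one training step changed.
encoder_before = [p.detach().clone() for p in image_encoder.parameters()]
decoder_before = [p.detach().clone() for p in text_decoder.parameters()]

# One training step on random placeholder data.
images = torch.randn(16, 8)
targets = torch.randn(16, 2)
with torch.no_grad():  # the encoder forward pass needs no autograd graph
    features = image_encoder(images)
loss = nn.functional.mse_loss(text_decoder(features), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

encoder_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(encoder_before, image_encoder.parameters())
)
decoder_changed = any(
    not torch.equal(before, after)
    for before, after in zip(decoder_before, text_decoder.parameters())
)
print(encoder_unchanged, decoder_changed)
```

Because the encoder forward pass runs under `torch.no_grad()`, no activations are stored for it, which is one reason training with a fixed encoder tends to be faster and lighter on memory.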

Feel free to open this issue again if you still face any problems!
Regards

