
nlxgpt's People

Contributors

cvpr2022annonymous, fawazsammani


nlxgpt's Issues

the link is broken

Hi,
The "annotations" link is broken. When I try to open it, I get:

404. That's an error. The requested URL was not found on this server. That's all we know.

What should I do?

best checkpoint

Thank you for the response to question 1. Do you mean that 2-3h, or 6h, is the time for training all 30 epochs?
And how do you determine which epoch gives the best model? Is it the epoch where a certain metric (such as B-4) reaches its best value on the validation set, or something else?
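
For reference, the recipe I have in mind is the usual one: keep the checkpoint from the epoch with the best validation metric, roughly as in the sketch below (train_one_epoch and evaluate_bleu4 are placeholders supplied by the caller, not functions from this repo):

import torch

def train_with_best_checkpoint(model, train_one_epoch, evaluate_bleu4,
                               num_epochs=30, ckpt_path="best_ckpt.pt"):
    """Keep only the checkpoint whose validation B-4 is highest.

    train_one_epoch and evaluate_bleu4 are caller-supplied callables standing
    in for the repo's own training loop and cococaption-based evaluation.
    """
    best_b4 = float("-inf")
    for epoch in range(num_epochs):
        train_one_epoch(model, epoch)         # one pass over the training set
        b4 = evaluate_bleu4(model)            # B-4 on the validation split
        if b4 > best_b4:                      # new best epoch: overwrite the saved checkpoint
            best_b4 = b4
            torch.save(model.state_dict(), ckpt_path)
    return best_b4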

How to run the pretrained VQA-X model on a single image in Google Colab

Hi!

I want to run the pretrained VQAX_p model (the same one used in the 'Explanations with Natural Text' Hugging Face demo). I have the project files and dependencies imported into Google Colab, but I am unsure which files/functions I should use to get an explanation for a single image.
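
For concreteness, the sketch below is roughly what I imagine the single-image path looks like. Every name in it (the image encoder, the prompt string, the cross-attention wiring) is my assumption rather than code taken from vqaX.py or the demo:

import torch
from PIL import Image
from torchvision import transforms

# CLIP-style image preprocessing (the paper uses a CLIP visual encoder);
# the mean/std values are CLIP's published normalization constants.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),
                         std=(0.26862954, 0.26130258, 0.27577711)),
])

@torch.no_grad()
def explain_one_image(model, image_encoder, tokenizer, image_path, question, max_len=40):
    """Greedy decoding for one image/question pair.

    model (the GPT-2 style decoder), image_encoder, and tokenizer are assumed
    to be the objects the repo's test code builds; the prompt string below
    follows the paper's description but may differ from vqaX.py.
    """
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    img_feats = image_encoder(img)  # visual features the decoder attends to (assumed interface)
    ids = tokenizer.encode("the question is " + question + " the answer is",
                           return_tensors="pt")
    for _ in range(max_len):  # greedy decoding, one token at a time
        logits = model(input_ids=ids, encoder_hidden_states=img_feats).logits
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        if next_id.item() == tokenizer.eos_token_id:
            break
        ids = torch.cat([ids, next_id], dim=1)
    return tokenizer.decode(ids[0], skip_special_tokens=True)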

For help

Hello, we are also doing related research. Could you please provide the captioning pretraining weights used in your paper?
[image attachment]

VQA_X finetuning

Hello!

I am attempting to finetune the VQA_X model and have run into some confusion about the required data.

I currently have a dataset of images and captions prepared and formatted similarly to vqaX_test_annot_full.json and vqaX_test_annot_exp.json, with one-to-one image/annotation pairs and the file path of the JPEG file for each image.

Do I also need to prepare an additional set of data formatted similarly to vqaX_val.json & vqaX_test.json (with answers, explanations, the image_id and image name) in order to finetune the model, or can I do so with only the dataset described above?
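
For reference, here is a purely illustrative guess at what one entry in such a file might contain, pieced together from the fields mentioned above; the real schema in vqaX_val.json / vqaX_test.json may well differ:

# Illustrative only: field names inferred from this issue, not from the repo's files.
example_entry = {
    "123456001": {                              # question id used as the key (assumption)
        "question": "what sport is this?",
        "answers": ["tennis"],                  # ground-truth answer(s)
        "explanation": ["the player is holding a racket on a court"],
        "image_id": 123456,
        "image_name": "COCO_val2014_000000123456.jpg",
    }
}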

Thanks

Pre-training step

Hi @fawazsammani,

Since the repo provides pretrained models but not a script for pretraining, I am wondering which split to use when pretraining on the four datasets mentioned in the paper (i.e., COCO captions, Flickr30k, VG and image paragraph captioning). I don't think this is well described in the paper. Do I need to split the datasets for pretraining, or can I pretrain the model on the entire datasets without splitting?

GPU and Train time

Thanks for your excellent work.
For the NLE models on the different datasets (VQA-X, ACT-X, e-SNLI-VE), how many GPUs are required for the pretraining and finetuning stages? And how many hours does each stage take?

Problems reproducing the VQA-X filtered scores reported in the paper

Hi @fawazsammani,

Thanks for your excellent work; I really appreciate it! However, I failed to reproduce the reported scores on the VQA-X dataset, so I want to check whether I am using the models from the models section correctly.

Are the models in the models section the checkpoints that achieve approximately the performance reported in the paper? In my experiments, the pretrained vqaX checkpoint only reaches 103 filtered CIDEr. Or should I be using different checkpoints to reproduce the results? Could you please give me some instructions for reproducing the scores reported in your paper?
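
For context, my understanding of the "filtered" scores is sketched below: keep an explanation only when the predicted answer matches the ground truth, then score the kept explanations with cococaption. The file names and result layout are my assumptions; the repo's actual logic lives in filter_and_get_scores in vqaX.py.

import json
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

def filtered_scores(predictions, ann_file_exp, res_file="filtered_res.json"):
    """predictions: list of dicts with image_id, pred_answer, gt_answer, explanation (assumed layout)."""
    kept = [{"image_id": p["image_id"], "caption": p["explanation"]}
            for p in predictions
            if p["pred_answer"] == p["gt_answer"]]       # the "filter" step
    with open(res_file, "w") as f:
        json.dump(kept, f)
    coco = COCO(ann_file_exp)                            # e.g. vqaX_test_annot_exp.json
    coco_res = coco.loadRes(res_file)
    coco_eval = COCOEvalCap(coco, coco_res)
    coco_eval.params["image_id"] = coco_res.getImgIds()  # score only the kept subset
    coco_eval.evaluate()
    return coco_eval.eval                                # dict with BLEU, METEOR, CIDEr, ...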

'prepare_data' folder and README incomplete

Hi, I was hoping to use this model to explain predictions for a different task than the one in the paper, which will require me to prepare explanations myself as well. However, the prepare_data folder referenced in the explain_predict folder appears to be empty. Will instructions for the specific steps the authors took in this process be coming soon? Thanks!

question about visual attention map

Hi @fawazsammani.

First of all, thank you once again for providing the tutorial for single-image usage.

I was playing around with the model and I am curious about one thing.
Together with the textual explanation, we can also get a visual explanation from the attention map.
I was wondering whether this visual explanation is computed over the entire output sentence (i.e., answer + explanation).
Would it be possible to split the visual attention map into two different images: one that focuses on the classification part (the answer) and another that focuses on the explanation part? The first would highlight the areas that are important for the answer prediction, and the second would show only the regions that matter for the explanation.
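
For concreteness, one way I imagine doing this split is sketched below. It assumes the per-token cross-attention maps over image patches are collected during decoding and that the output follows an "<answer> because <explanation>" pattern; both are assumptions on my part.

import torch

def split_attention_maps(token_strings, attn_maps):
    """token_strings: decoded generated tokens; attn_maps: tensor of shape [num_tokens, H, W]."""
    try:
        # first occurrence of "because" marks the answer/explanation boundary (assumed format)
        split = next(i for i, t in enumerate(token_strings) if t.strip() == "because")
    except StopIteration:
        split = len(token_strings)                  # no explanation found: treat everything as answer
    answer_map = attn_maps[:split].mean(dim=0) if split > 0 else None
    explanation_map = (attn_maps[split + 1:].mean(dim=0)
                       if split + 1 < len(token_strings) else None)
    return answer_map, explanation_map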

Sorry for my lack of knowledge regarding this problem.
Thank you for your time.
Best wishes.

No such file or directory: 'cococaption/annotations/vqaX_test_annot_exp.json'

Hi @fawazsammani,

Thank you for sharing the code!
I have a small issue when training the model on VQA-X. The training script complains that the file cococaption/annotations/vqaX_test_annot_exp.json is missing, and indeed there is no such file in that folder (cf. https://github.com/ruotianluo/coco-caption/tree/ea20010419a955fed9882f9dcc53f2dc1ac65092/annotations). Could you help me with this issue? Thank you.

Below is the full error message:

Evaluation: Finished 1967/1968          loading annotations into memory...
Traceback (most recent call last):
  File "/data/lee/home/Projects/nlxgpt/vqaX.py", line 595, in <module>
    filter_and_get_scores(
  File "/data/lee/home/Projects/nlxgpt/vqaX.py", line 126, in filter_and_get_scores
    coco = COCO(annFileExp)
  File "/data/lee/home/Projects/nlxgpt/cococaption/pycocotools/coco.py", line 76, in __init__
    dataset = json.load(open(annotation_file, 'r'))
FileNotFoundError: [Errno 2] No such file or directory: 'cococaption/annotations/vqaX_test_annot_exp.json'
Traceback (most recent call last):
  File "/export/home/lee/miniconda3/envs/nlx-gpt/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/export/home/lee/miniconda3/envs/nlx-gpt/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/export/home/lee/miniconda3/envs/nlx-gpt/lib/python3.9/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/export/home/lee/miniconda3/envs/nlx-gpt/lib/python3.9/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/export/home/lee/miniconda3/envs/nlx-gpt/bin/python3.9', 'vqaX.py']' returned non-zero exit status 1.
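
As a workaround on my side, I added a small pre-flight check (not from the repo) so that missing annotation files surface before training rather than after the evaluation loop; the second file name is taken from the finetuning issue above:

import os

required = [
    "cococaption/annotations/vqaX_test_annot_exp.json",
    "cococaption/annotations/vqaX_test_annot_full.json",
]
missing = [p for p in required if not os.path.isfile(p)]
if missing:
    raise FileNotFoundError("missing annotation files: " + ", ".join(missing))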
