
prophet's People

Contributors

bruceisme, mil-vlg, paradoxzw


prophet's Issues

The process of image caption

Is there code for the image captioning process? I found a "captions_okvqa.json" file in the assets, but I could not find the code that generates this file.
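The repo appears to ship only the resulting captions_okvqa.json, not the captioning script. For anyone who wants to regenerate something similar, here is a minimal sketch with BLIP from transformers (an assumption: BLIP is one possible off-the-shelf captioner, not necessarily the one the authors used; the image path is illustrative):

from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Illustrative COCO 2014 val image path.
image = Image.open("datasets/coco2014/val2014/COCO_val2014_000000000042.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))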

Could you provide a finetuned model for A-OKVQA dataset?

Hi, I am quite interested in your nice work and happy to see the code has been released!
A finetuned model for OK-VQA has been provided, and it works.
I want to try your method on another dataset, A-OKVQA, but I can't find the finetuned checkpoint.
So could you provide the finetuned model for the A-OKVQA dataset? Thanks!

mcan_530_okvqa.json

After running finetune.sh, the generated JSON file contains 10096 entries, while the mcan_530_okvqa.json you provided has only 5048.
Also, when I evaluate, the accuracy is only 50%. What could cause this? Is my pipeline wrong? I am using the pretrained model you provided.

bug

[screenshot of the error]
Have you ever encountered this kind of bug?

The first candidate answer of your provided candidates_okvqa.json in assets.zip

Thank you very much for providing the code. I computed the accuracy of the first candidate answer on the OK-VQA val split using the candidates_okvqa.json from the assets.zip you provided. The code I ran is below. The accuracy turns out to be 47.06 instead of 53. Did I do something wrong?

import json

# load the provided candidate answers and the OK-VQA val annotations
with open('candidates_okvqa.json') as f:
    answer_candidates = json.load(f)
with open('mscoco_val2014_annotations.json') as f:
    val_datasets_annotations = json.load(f)['annotations']

# organize the ~10 raw ground-truth answers per question
val_datasets = []
for val_a in val_datasets_annotations:
    multi_answers = [ans['raw_answer'] for ans in val_a['answers']]
    val_datasets.append({'question_id': val_a['question_id'], 'direct_answers': multi_answers})

# score a predicted answer against the ground-truth answers
# (a 0.3/0.6/1.0 approximation of VQA soft accuracy)
def direct_scores(pred_answer, direct_answers):
    cnt = sum(1 for answer in direct_answers if pred_answer == answer)
    if cnt == 1:
        return 0.3
    elif cnt == 2:
        return 0.6
    elif cnt > 2:
        return 1.0
    return 0.0

# accuracy of the first candidate answer over all samples
acc = 0.0
for single_sample in val_datasets:
    candidates = answer_candidates[str(single_sample['question_id'])]
    single_sample['DA_candidate'] = [each['answer'] for each in candidates]
    scores = [direct_scores(c, single_sample['direct_answers']) for c in single_sample['DA_candidate']]
    acc += scores[0]
print(acc / len(val_datasets))

Looking forward to your reply.
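For reference, the official VQA evaluation differs from the 0.3/0.6/1.0 lookup above in two ways: answers are normalized first (punctuation, articles, number words; see evaluation/ans_punct.py in this repo), and each prediction is scored as min(1, matches/3) averaged over the ten leave-one-out subsets of annotators. A minimal sketch of that formula, without the normalization step:

def vqa_soft_accuracy(pred, gt_answers):
    # Average min(1, matches/3) over the ten leave-one-out subsets
    # of the ~10 human answers, per the official VQA metric.
    scores = []
    for i in range(len(gt_answers)):
        others = gt_answers[:i] + gt_answers[i + 1:]
        matches = sum(1 for a in others if a == pred)
        scores.append(min(1.0, matches / 3.0))
    return sum(scores) / len(scores)

Either difference (the normalization especially) could account for a gap like 47.06 vs. 53.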

50GB memory?

"To conduct the following experiments, a machine with at least 1 RTX 3090 GPU, 50GB memory", wherein "50GB memory" refers to Memory for CPU or GPU?

mcan_530_okvqa.json

Hi, could you tell me which part of the code generates mcan_530_okvqa.json? Thanks.

Trained model

Can we use the model you have already trained, instead of training it ourselves with the existing code?


Hello, I have a question about the captioning part

Can this multimodal model be understood as using captions and answer heuristics to turn image content into text, so the two modalities interact through language?
Since the captions come from translating the image with an off-the-shelf captioning model, isn't that external image-to-text model the real key to the multimodal interaction? If so, that external model largely determines the overall quality: if it performs poorly, the noise would be large and calling GPT would be of little use, right?

When training stage 1, pretraining, finetuning, and candidate answer generation all fail with the same error: OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the...

When I run the official stage 1 commands for pretraining, finetuning, and candidate answer generation, they all fail with the same error:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like bert-large-cased is not the path to a directory containing a file named config.json
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode
I added this line to main.py:
TRANSFORMERS_OFFLINE=1  # supposed to enable running offline
which raised a new error:
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104,'Connection reset by peer'))
How can I solve this?
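One likely cause, offered as an assumption since the full main.py isn't shown: written as a bare Python statement, TRANSFORMERS_OFFLINE=1 only creates a local variable; transformers reads that flag from the process environment. A minimal sketch that sets it properly, assuming bert-large-cased was already downloaded into the local cache on a run with internet access:

import os

# Set the flag in the environment BEFORE transformers is imported;
# a bare `TRANSFORMERS_OFFLINE = 1` statement has no effect on it.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoTokenizer

# Resolves bert-large-cased purely from the local cache; run this once
# with internet access (and without the flag) first to populate the cache.
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")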

OpenAI's apikey

I can't reach the OpenAI API from my rented server. Is there any workaround? Thank you.

okvqa-stage1-pretrain

When I pretrain on OK-VQA with the MCAN model, it fails with this error:
raise LocalEntryNotFoundError( huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
During handling of the above exception, another exception occurred:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like bert-large-uncased is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

So do I need to download bert-large-uncased online first, and then run the code offline?
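That is the usual workaround. A minimal sketch (the directory names are placeholders): download the files once on a machine with internet access, save them locally, then load from the local path so no hub lookup happens:

from transformers import AutoModel, AutoTokenizer

# On a machine WITH internet access: fetch once and save to a folder.
AutoTokenizer.from_pretrained("bert-large-uncased").save_pretrained("./bert-large-uncased")
AutoModel.from_pretrained("bert-large-uncased").save_pretrained("./bert-large-uncased")

# On the offline machine: load from the local path; no network needed.
tokenizer = AutoTokenizer.from_pretrained("./bert-large-uncased", local_files_only=True)
model = AutoModel.from_pretrained("./bert-large-uncased", local_files_only=True)

Since load_data.py calls AutoTokenizer.from_pretrained(__C.BERT_VERSION), pointing the BERT_VERSION config at this folder should have the same effect.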

Prerequisites Questions

Dear author, when I ran "conda env create -f environment.yml" an error occurred:
[screenshot of the error]
Is it right to delete the "@v1.0"?
I hope you can help me answer this question, thank you very much.

Naive question on OK-VQA and A-OKVQA evaluation.

Hi @ParadoxZW @MIL-VLG, thanks for your great project.

I am not very familiar with OK-VQA and A-OKVQA evaluation. Here are some naive questions:

  • OK-VQA and A-OKVQA use an open-ended QA setting: each question has ~10 ground-truth answers (some of them identical). Do you use exact match (VQAv2-style, matching at least 3 ground-truth answers) to compute accuracy?
  • Is it common to train on A-OKVQA train+val and run inference on the A-OKVQA test split?

KeyError: 179520 ??

while running command

bash scripts/pretrain.sh \
    --task ok --version okvqa_pretrain_1 --gpu 0

I met this problem:

Traceback (most recent call last):
  File "/root/autodl-fs/prophet-main/main.py", line 35, in <module>
    runner.run()
  File "/root/autodl-fs/prophet-main/prophet/stage1/pretrain.py", line 162, in run
    self.train(train_set, valid_set)
  File "/root/autodl-fs/prophet-main/prophet/stage1/pretrain.py", line 93, in train
    for step, input_tuple in enumerate(dataloader):
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 652, in __next__
    data = self._next_data()
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1347, in _next_data
    return self._process_data(data)
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1373, in _process_data
    data.reraise()
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/root/autodl-fs/prophet-main/prophet/stage1/utils/load_data.py", line 136, in __getitem__
KeyError: 179520

whole project structure:

prophet-main
├── assets
│   ├── answer_aware_examples_okvqa.json
│   ├── answer_dict_aokvqa.json
│   ├── answer_dict_okvqa.json
│   ├── answer_dict_vqav2.json
│   ├── candidates_aokvqa_test.json
│   ├── candidates_aokvqa_val.json
│   ├── candidates_okvqa.json
│   ├── captions_aokvqa.json
│   ├── captions_okvqa.json
│   ├── examples_aokvqa_test.json
│   ├── examples_aokvqa_val.json
│   └── Untitled.ipynb
├── ckpts
│   └── epoch_6.pkl
├── CLIP
│   ├── clip
│   │   ├── bpe_simple_vocab_16e6.txt.gz
│   │   ├── clip.py
│   │   ├── __init__.py
│   │   ├── model.py
│   │   └── simple_tokenizer.py
│   ├── CLIP.png
│   ├── data
│   │   ├── country211.md
│   │   ├── prompts.md
│   │   ├── rendered-sst2.md
│   │   └── yfcc100m.md
│   ├── hubconf.py
│   ├── LICENSE
│   ├── MANIFEST.in
│   ├── model-card.md
│   ├── notebooks
│   │   ├── Interacting_with_CLIP.ipynb
│   │   └── Prompt_Engineering_for_ImageNet.ipynb
│   ├── README.md
│   ├── requirements.txt
│   ├── setup.py
│   └── tests
│       └── test_consistency.py
├── configs
│   ├── finetune.yml
│   ├── path_cfgs.py
│   ├── pretrain.yml
│   ├── prompt.yml
│   ├── __pycache__
│   │   ├── path_cfgs.cpython-39.pyc
│   │   ├── task_cfgs.cpython-39.pyc
│   │   └── task_to_split.cpython-39.pyc
│   ├── task_cfgs.py
│   └── task_to_split.py
├── datasets
│   ├── aokvqa
│   │   ├── aokvqa_v1p0_test.json
│   │   ├── aokvqa_v1p0_train.json
│   │   └── aokvqa_v1p0_val.json
│   ├── coco2014
│   │   ├── train2014
│   │   ├── train2014.zip
│   │   └── val2014
│   ├── coco2014_feats
│   │   ├── train2014
│   │   ├── train2014.zip
│   │   ├── val2014
│   │   └── val2014.zip
│   ├── coco2017
│   ├── coco2017_feats
│   ├── datasets.zip
│   ├── okvqa
│   │   ├── mscoco_train2014_annotations.json
│   │   ├── mscoco_val2014_annotations.json
│   │   ├── OpenEnded_mscoco_train2014_questions.json
│   │   └── OpenEnded_mscoco_val2014_questions.json
│   ├── old_data
│   │   ├── coco2014
│   │   └── coco2014_feats
│   ├── Untitled.ipynb
│   └── vqav2
│       ├── v2_mscoco_train2014_annotations.json
│       ├── v2_mscoco_val2014_annotations.json
│       ├── v2_OpenEnded_mscoco_train2014_questions.json
│       ├── v2_OpenEnded_mscoco_val2014_questions.json
│       ├── v2valvg_no_ok_annotations.json
│       ├── v2valvg_no_ok_questions.json
│       ├── vg_annotations.json
│       └── vg_questions.json
├── environment.yml
├── evaluation
│   ├── ans_punct.py
│   ├── aok_utils
│   │   ├── eval_predictions.py
│   │   ├── load_aokvqa.py
│   │   ├── __pycache__
│   │   └── remap_predictions.py
│   ├── aokvqa_evaluate.py
│   ├── okvqa_evaluate.py
│   ├── __pycache__
│   │   ├── ans_punct.cpython-39.pyc
│   │   ├── aokvqa_evaluate.cpython-39.pyc
│   │   └── okvqa_evaluate.cpython-39.pyc
│   └── vqa_utils
│       ├── __pycache__
│       ├── vqaEval.py
│       └── vqa.py
├── LICENSE
├── main.py
├── misc
│   ├── framework.png
│   └── tree.txt
├── outputs
│   ├── ckpts
│   │   ├── okvqa_finetune_1
│   │   ├── okvqa_heuristics_1
│   │   └── okvqa_pretrain_1
│   ├── logs
│   │   ├── okvqa_finetune_1
│   │   └── okvqa_pretrain_1
│   └── results
│       ├── okvqa_finetune_1
│       └── okvqa_heuristics_1
├── preds
├── prophet
│   ├── __init__.py
│   ├── __pycache__
│   │   └── __init__.cpython-39.pyc
│   ├── stage1
│   │   ├── finetune.py
│   │   ├── heuristics.py
│   │   ├── model
│   │   ├── pretrain.py
│   │   ├── __pycache__
│   │   └── utils
│   └── stage2
│       ├── prompt.py
│       └── utils
├── README.md
├── scripts
│   ├── evaluate_file.sh
│   ├── evaluate_model.sh
│   ├── extract_img_feats.sh
│   ├── finetune.sh
│   ├── heuristics_gen.sh
│   ├── pretrain.sh
│   └── prompt.sh
├── --task
├── tools
│   ├── extract_img_feats.py
│   ├── __pycache__
│   │   └── transforms.cpython-39.pyc
│   └── transforms.py
└── Untitled.ipynb
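The KeyError is raised inside __getitem__ in load_data.py when it looks up image id 179520 in the loaded feature dictionary, which suggests the extracted features do not cover every image the questions reference (for example, if extract_img_feats.sh was only run on one split, or a feats zip was partially extracted). A hedged sanity check, assuming one .npz feature file per image whose basename ends with the zero-padded image id (that naming pattern is an assumption; adjust it to the real scheme):

import glob
import json
import os

# Image ids referenced by the OK-VQA questions.
image_ids = set()
for split in ['train2014', 'val2014']:
    with open(f'datasets/okvqa/OpenEnded_mscoco_{split}_questions.json') as f:
        for q in json.load(f)['questions']:
            image_ids.add(q['image_id'])

# Image ids that actually have extracted feature files.
feat_ids = set()
for path in glob.glob('datasets/coco2014_feats/*/*.npz'):
    stem = os.path.splitext(os.path.basename(path))[0]
    feat_ids.add(int(stem.split('_')[-1]))

missing = image_ids - feat_ids
print(f'{len(missing)} referenced images have no features; e.g. {sorted(missing)[:5]}')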

Accuracy does not increase

I trained on a custom dataset. During training, the loss decreased but the accuracy stayed at zero.
[screenshot of the training log]

When training stage 1, pretraining, finetuning, and candidate answer generation all fail with the same error: TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType. How can I solve this?

Loading common data...
== Total image number: 123287
Traceback (most recent call last):
  File "/root/autodl-tmp/prophet/main.py", line 40, in <module>
    runner.run()
  File "/root/autodl-tmp/prophet/prophet/stage1/pretrain.py", line 160, in run
    common_data = CommonData(self.__C)
  File "/root/autodl-tmp/prophet/prophet/stage1/utils/load_data.py", line 55, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained(__C.BERT_VERSION)
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 676, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained
    return cls._from_pretrained(
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1834, in _from_pretrained
    slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1959, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/miniconda3/envs/prophet/lib/python3.9/site-packages/transformers/models/bert/tokenization_bert.py", line 213, in __init__
    if not os.path.isfile(vocab_file):
  File "/root/miniconda3/envs/prophet/lib/python3.9/genericpath.py", line 30, in isfile
    st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

Replacing GPT-3 with other academic LLMs

Thank you so much for your excellent work!

I have a minor question about the LLM selection. Have you tried other academic LLMs, e.g., LLaMA, in place of GPT-3? Would it make a big performance difference? Thanks!

Best regards

1

1

assets

May I ask whether the files in the assets folder are supposed to be created by me? If they were generated by code, please point me to that code. Thank you.

Are MCAN pretraining and OK-VQA finetuning performed together? Shouldn't MCAN be pretrained first and then finetuned?

At this stage, we train an improved MCAN model through pretraining on VQA v2 and finetuning on the target dataset. Taking OK-VQA as an example, run the pretraining step with:

$ bash scripts/pretrain.sh --task ok --version okvqa_pretrain_1 --gpu 0

Are MCAN pretraining and OK-VQA finetuning performed together? I would expect MCAN to be pretrained first and then finetuned.
However, in the script above the task is "ok": does that mean MCAN pretraining has already finished and this command finetunes on OK-VQA? Or are pretraining and finetuning executed together?
Shouldn't there be a separate script just for MCAN pretraining, which saves a checkpoint (provided for download), followed by finetuning on OK-VQA?

skip step 1 and go directly to step 2

Step 1 takes a long time. You mentioned in the introduction that we can skip step 1 and go directly to step 2 using the answer_aware_examples_okvqa.json and candidates_okvqa.json you provide, right?

Checkpoints Availability

Hello! I was wondering when/if the checkpoints for prophet will be made publicly available? Thanks in advance :)

When I run the stage 2 command, it reports an error connecting to OpenAI. What could be the reason?

Loaded dataset size: 9009, top10 accuracy: 91.81, top1 accuracy: 86.54
Loaded dataset size: 5046, top10 accuracy: 79.83, top1 accuracy: 53.05

Working... 0/5046 0:06:31 <class 'openai.error.APIConnectionError'> Error communicating with OpenAI
retrying...
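An APIConnectionError at 0/5046 usually means api.openai.com is unreachable from the machine (common on rented servers in regions where it is blocked). One workaround, assuming an HTTP proxy that can reach OpenAI is available (the address below is a placeholder): the openai 0.x client sends requests through the requests library, which honors the standard proxy environment variables:

import os

# Placeholder proxy address; replace with a proxy that can reach api.openai.com.
os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]
# Quick connectivity test with a tiny completion request.
resp = openai.Completion.create(model="text-davinci-002", prompt="test", max_tokens=5)
print(resp["choices"][0]["text"])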

OpenAI-Api Cost

I'd like to know the cost of using the OpenAI API for the whole pipeline, or for a single test run of the model.
I'm afraid I can't afford the expense of these experiments.
An approximate cost would be enough. I hope I can get an answer, thank you very much.
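For a rough back-of-the-envelope estimate, the arithmetic is: number of test questions × (prompt tokens + completion tokens) × price per token. All constants below are illustrative assumptions (check your actual prompt length and current OpenAI pricing); the 5,046 test questions come from the logs in the issue above:

# Back-of-the-envelope OpenAI API cost estimate; every constant below
# is an assumption for illustration, not an official figure.
num_questions = 5046          # OK-VQA test set size (from the logs above)
tokens_per_prompt = 1200      # assumed: in-context examples + question + caption
tokens_per_completion = 10    # assumed: short answers
price_per_1k_tokens = 0.02    # assumed: GPT-3 davinci-era pricing, USD

cost = num_questions * (tokens_per_prompt + tokens_per_completion) / 1000 * price_per_1k_tokens
print(f"~${cost:.0f} per full OK-VQA test run")  # ~$122 under these assumptions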
