airsplay / vokenization

PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
License: MIT License
Thanks for your great work! I notice that the voken classification head in the code is a non-linear layer with GELU, a LayerNorm operation, and a linear layer called decoder, which differs from the paper, where it is described as a softmax layer following a linear layer. Do the two perform similarly, or did I misunderstand the paper?
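For reference, the two head variants discussed above can be sketched as follows. This is a hedged illustration, assuming a BERT-MLM-style head; the class names are mine, not the repo's:

```python
import torch
import torch.nn as nn

class VokenHeadPaper(nn.Module):
    """Paper description: a linear layer, with softmax applied in the loss."""
    def __init__(self, hidden_size, num_vokens):
        super().__init__()
        self.decoder = nn.Linear(hidden_size, num_vokens)

    def forward(self, hidden_states):
        # returns logits; softmax/cross-entropy is applied by the loss
        return self.decoder(hidden_states)

class VokenHeadCode(nn.Module):
    """Code version: dense + GELU + LayerNorm before the decoder,
    mirroring the structure of BERT's MLM prediction head."""
    def __init__(self, hidden_size, num_vokens):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.act = nn.GELU()
        self.norm = nn.LayerNorm(hidden_size)
        self.decoder = nn.Linear(hidden_size, num_vokens)

    def forward(self, hidden_states):
        return self.decoder(self.norm(self.act(self.dense(hidden_states))))
```

The extra dense/GELU/LayerNorm stack only adds a non-linear transform before the classifier, so both variants produce logits of the same shape.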
Hi authors,
Thanks for sharing this nice work! I'm a big fan of it.
I notice the paper reports results on the SQuAD datasets, but I did not find the relevant code in this repo. Could you please shed some light on how to evaluate on SQuAD?
Any reply will be appreciated!
Sincerely,
Peter
Hi, thanks for your interesting work.
I ran into a problem when fine-tuning the model. I loaded the released pretrained BERT_base model and fine-tuned it on GLUE with the given fine-tuning scripts, but got only 69.08 on QQP and 31.82 on MNLI. So I wonder: (1) Is the GLUE performance reported in the paper exactly the performance after three epochs of fine-tuning, or did you pick the highest score during fine-tuning? (2) For the pretrained model, did you use the checkpoint from the last iteration, or did you pick one during pretraining?
Thanks in advance.
Training of Epoch 0: GPU 0 will process 591616 data in 2311 iterations.
0%| | 0/2311 [00:31<?, ?it/s]
Traceback (most recent call last):
File "xmatching/main.py", line 313, in <module>
main()
File "xmatching/main.py", line 43, in main
mp.spawn(train, nprocs=args.gpus, args=(args,))
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
while not context.join():
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
raise Exception(msg)
Exception:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
fn(i, *args)
File "/home/zhanhaolan/codes/vokenization/xmatching/main.py", line 233, in train
for i, (uid, lang_input, visn_input) in enumerate(tqdm.tqdm(train_loader, disable=(gpu!=0))):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/tqdm/std.py", line 1167, in __iter__
for obj in iterable:
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
data = self._next_data()
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
return self._process_data(data)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
data.reraise()
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/zhanhaolan/anaconda3/envs/torch1.4py37/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [14] at entry 0 and [12] at entry 1
Hi, do you have any idea about this issue?
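One hedged workaround for the error above (not the repo's official fix): the stack failure means the language inputs in a batch have different token lengths (14 vs 12), so the default collate cannot stack them. A custom collate_fn that pads the token tensors to the batch maximum avoids it. Each dataset item is assumed to be (uid, lang_input, visn_input), matching the training loop's unpacking; the pad id 0 is also an assumption.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_collate(batch):
    # each item is assumed to be (uid, lang_input, visn_input)
    uids, lang_inputs, visn_inputs = zip(*batch)
    # pad variable-length 1-D token id tensors with 0 (assumed pad id)
    lang_padded = pad_sequence(lang_inputs, batch_first=True, padding_value=0)
    # visual features are fixed-size, so plain stacking works
    visn_stacked = torch.stack(visn_inputs, 0)
    return list(uids), lang_padded, visn_stacked
```

It would be passed as `collate_fn=pad_collate` when building the DataLoader; alternatively, truncating/padding to a fixed length inside the dataset's `__getitem__` achieves the same thing.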
Thanks for your awesome work, but something goes wrong when I run extract_vision_keys.py. At line 139:

dset = h5_file.create_dataset("keys", (len(saved_img_paths), last_dim))

it reports:

OverflowError: can't convert negative value to hsize_t.

Do you have any idea how to solve this error? I would appreciate your early reply.
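A hedged diagnostic sketch for the error above: h5py raises this OverflowError when a dataset dimension is negative, which usually means `last_dim` was never updated from a negative initial value (e.g. no feature batch was processed) or the image list is empty. Validating the shape before the `create_dataset` call narrows down which dimension is bad; the function name here is mine, not the repo's:

```python
def validate_keys_shape(saved_img_paths, last_dim):
    """Return the (rows, cols) shape for the "keys" HDF5 dataset,
    raising a readable error instead of letting h5py hit the
    hsize_t OverflowError on a negative dimension."""
    n = len(saved_img_paths)
    if n <= 0 or last_dim <= 0:
        raise ValueError(
            f"invalid dataset shape ({n}, {last_dim}); "
            "check that images were found and features were extracted")
    return (n, last_dim)

# shape = validate_keys_shape(saved_img_paths, last_dim)
# dset = h5_file.create_dataset("keys", shape)
```

If the check fails with `last_dim` negative, the feature-extraction loop before line 139 likely never ran, which points at an empty or mis-specified image directory.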
I'm trying to integrate Vokenization with BERTScore, and I'd like clarification on which tokenizer is used for the pretrained RoBERTa + VLM model. Is it roberta-base or bert-base-uncased?
There is an AttributeError when running the extract_vision_keys.py:
Load model from snap/xmatching/bert_resnext/BEST.pth.model.
Traceback (most recent call last):
File "vokenization/extract_vision_keys.py", line 259, in <module>
joint_model = torch.load(args.load_dir + '/BEST.pth.model')
File "/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/torch/serialization.py", line 595, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/torch/serialization.py", line 774, in _legacy_load
result = unpickler.load()
AttributeError: Can't get attribute 'gelu' on <module 'transformers.modeling_bert' from '/data/home/cwq/.miniconda3/envs/xlm/lib/python3.6/site-packages/transformers/modeling_bert.py'>
As required by requirements.txt, my transformers version is 3.3.0.
Is it a version problem? Or should something be added to modeling_bert.py?
Do you have any idea about this issue? Thank you.
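A hedged workaround sketch, not an official fix: the checkpoint was pickled when transformers.modeling_bert still exposed a module-level `gelu` function; later versions removed it, so the unpickler cannot resolve the reference. Re-attaching the missing attribute before `torch.load` lets the pickle load; this assumes the missing symbol is the only incompatibility. The helper name `ensure_attr` is mine:

```python
import importlib

def ensure_attr(module_name, attr_name, fallback):
    """Re-attach a function that was removed from a module so that
    pickles referencing module_name.attr_name can still be loaded."""
    mod = importlib.import_module(module_name)
    if not hasattr(mod, attr_name):
        setattr(mod, attr_name, fallback)
    return mod

# Usage sketch, assuming transformers 3.x where gelu lives in
# transformers.activations:
#   from transformers.activations import gelu
#   ensure_attr("transformers.modeling_bert", "gelu", gelu)
#   joint_model = torch.load(args.load_dir + '/BEST.pth.model')
```

The cleaner long-term fix is to re-save the checkpoint as a state_dict (`torch.save(model.state_dict(), ...)`), which does not pin module paths into the pickle.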
Hi,
Thank you for your great work.
I'm trying to train a RoBERTa-based VLM model on my own dataset, and I plan to use the pre-trained vokenizer provided here. But in my understanding, your pre-trained vokenizer (the cross-modal matching model) is trained with the bert-base-uncased tokenizer, so I need revokenization to obtain voken labels that can be used for RoBERTa-based VLM training.
I found the code revokenize_corpus_mp.py, but it seems that Revokenizer is not used there (though it is imported).
Could you update the code accordingly, or give me some instructions on how to use the Revokenizer class?
Also, I would appreciate it if you could provide a sample bash script like mpvokenize_wiki.bash to apply revokenization to a given dataset.
Thanks.
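For readers hitting the same question: the core alignment step that revokenization presumably has to perform can be sketched as below. This is a hedged illustration of the idea, not the repo's actual Revokenizer: vokens assigned under one tokenizer are transferred to another tokenizer's subwords by aligning both token sequences to character spans of the same raw text (e.g. via each tokenizer's offset mappings).

```python
def transfer_vokens(src_offsets, src_vokens, tgt_offsets):
    """Assign each target token the voken of the source token whose
    character span overlaps it most. Offsets are (start, end) pairs
    into the same raw string; -1 marks a target token with no overlap."""
    tgt_vokens = []
    for ts, te in tgt_offsets:
        best, best_overlap = -1, 0
        for (ss, se), v in zip(src_offsets, src_vokens):
            overlap = max(0, min(te, se) - max(ts, ss))
            if overlap > best_overlap:
                best, best_overlap = v, overlap
        tgt_vokens.append(best)
    return tgt_vokens
```

With HuggingFace tokenizers, the offset pairs could come from `return_offsets_mapping=True` on the fast tokenizers for bert-base-uncased (source) and roberta-base (target).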
Hello,
Thank you for your great work!
I wonder whether vokens are only used for pre-training. That is to say, when fine-tuning, can one directly use the pre-trained model you have released as a BERT-like model, with no need to vokenize the downstream data?
Hi,
this work is so interesting. I was wondering whether you have a plan to release a trained BERT+Voken-cls model?
Thanks!
I have two questions.
(1) I notice that in your code
Line 238 in 5601b79
(2) What is the intuition for why the voken classification loss can improve model performance? I suspect that different words with similar semantics will have the same voken labels, so the voken classification loss optimizes their similarity. What is your opinion? Could you share your intuition?
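To make the intuition in (2) concrete, here is a hedged sketch of the voken classification objective as I read the paper (shapes and names are illustrative): each token position predicts the id of its assigned voken, so the objective is a per-token cross-entropy, analogous to the MLM loss but with voken ids as targets. Tokens with similar meaning tend to receive the same voken, so minimizing this loss pulls their hidden representations toward the same class.

```python
import torch
import torch.nn.functional as F

batch, seq_len, num_vokens = 2, 8, 100
logits = torch.randn(batch, seq_len, num_vokens)              # voken head output
voken_labels = torch.randint(0, num_vokens, (batch, seq_len)) # assigned voken ids

# per-token cross-entropy over the voken vocabulary
loss = F.cross_entropy(logits.view(-1, num_vokens), voken_labels.view(-1))
```

In training this would be added to the MLM loss, so the visual supervision acts as an auxiliary signal rather than replacing the language objective.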