Hello authors, excellent work! I have run into a bug that I cannot solve when running dodrio-data-gen.py.
The bug seems to originate from this line:
my_loss,my_logit,attentions=my_model(tokens,attention_mask=masks,labels=labels.long(),output_attentions=True)
Using device: cuda
Reusing dataset glue (/root/.cache/huggingface/datasets/glue/sst2/1.0.0/7c99657241149a24692c402a5c3f34d4c9f1df5ac2e4c3759fadea38f6cb29c4)
2. Extracting Attention Weights and Gradients...
Some weights of the model checkpoint at bert-base-uncased were not used when initializing MyBertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing MyBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing MyBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of MyBertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
0%| | 0/127 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
cli.main()
File "/root/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
run()
File "/root/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/yukyin/PycharmProjects/test/xiaoyanghua_new/humor/dodrio/data-generation/dodrio-data-gen.py", line 966, in <module>
output = my_model(input_ids=tokens
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 1502, in forward
outputs = self.bert(
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 971, in forward
encoder_outputs = self.encoder(
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 568, in forward
layer_outputs = layer_module(
File "/home/yukyin/anaconda3/envs/exp381/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
TypeError: forward() takes from 2 to 7 positional arguments but 8 were given
I have tried several different versions of transformers, but the error persists.
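For what it's worth, the `TypeError: forward() takes from 2 to 7 positional arguments but 8 were given` usually indicates a version mismatch: newer transformers releases pass an extra positional argument (e.g. `past_key_value`) from `BertEncoder` down to each layer's `forward()`, so a layer written against an older signature receives one argument too many. The sketch below is my own minimal illustration of that mechanism, not the Dodrio code; the class and parameter names are made up for the example.

```python
# Illustration of the error mechanism: a layer whose forward() mirrors an
# older 6-argument signature, called the way a newer encoder would call it
# (with one extra positional argument).

class OldStyleLayer:
    # self + 6 parameters -> "takes from 2 to 7 positional arguments"
    def forward(self, hidden, attention_mask=None, head_mask=None,
                encoder_hidden_states=None, encoder_attention_mask=None,
                output_attentions=False):
        return hidden

layer = OldStyleLayer()

# A newer-style caller passes 7 positional arguments (8 counting self):
try:
    layer.forward("h", None, None, None, None, None, False)
except TypeError as e:
    # message like: forward() takes from 2 to 7 positional arguments but 8 were given
    print(e)
```

If that is indeed the cause here, aligning the installed transformers version with the one the repo was developed against (or updating the custom layer's `forward()` signature to accept the extra argument) should resolve it.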