Comments (15)
Has this problem been solved? I have run into it as well.
from mega.
Same here. Has it been solved?
from mega.
Just read the file in binary mode (`rb`).
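A minimal sketch of what the `rb` suggestion means, assuming the utf-8 error comes from opening a non-text file in text mode. The demo file and its contents are hypothetical and only stand in for the real dataset files; the exact line to change in the MEGA loader is not shown in this thread.

```python
import os
import tempfile


def read_bytes(path):
    """Read a file as raw bytes, so Python never tries to decode it as text."""
    with open(path, "rb") as f:
        return f.read()


# Hypothetical demo file whose first bytes are not valid UTF-8.
payload = b"\xff\xfe\x00\x01 0.5 0.25"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

# Text mode with utf-8 decoding fails on this file.
try:
    with open(demo_path, "r", encoding="utf-8") as f:
        f.read()
    text_mode_ok = True
except UnicodeDecodeError:
    text_mode_ok = False

# Binary mode returns the bytes unchanged.
raw = read_bytes(demo_path)
os.remove(demo_path)
```

In binary mode the caller gets a `bytes` object and has to parse or decode it explicitly afterwards.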
from mega.
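@bingoarthur Hello, after modifying the source code as described above, the utf-8 error is gone. However, when I run the train command, the model appears to be loaded a second time, after which memory fills up (I have 32 GB of RAM) and a MemoryError is raised. Have you run into a similar problem?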
2022-12-07 11:01:38,958 - root - INFO - Arguments:
2022-12-07 11:01:38,958 - root - INFO - pretrain_path: bert-base-uncased
2022-12-07 11:01:38,958 - root - INFO - ckpt: MEGA
2022-12-07 11:01:38,958 - root - INFO - pooler: entity
2022-12-07 11:01:38,958 - root - INFO - only_test: False
2022-12-07 11:01:38,958 - root - INFO - mask_entity: False
2022-12-07 11:01:38,958 - root - INFO - metric: micro_f1
2022-12-07 11:01:38,958 - root - INFO - dataset: ours
2022-12-07 11:01:38,958 - root - INFO - train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:01:38,958 - root - INFO - val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:01:38,959 - root - INFO - test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:01:38,959 - root - INFO - rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:01:38,959 - root - INFO - batch_size: 32
2022-12-07 11:01:38,959 - root - INFO - lr: 2e-05
2022-12-07 11:01:38,959 - root - INFO - max_length: 128
2022-12-07 11:01:38,959 - root - INFO - max_epoch: 10
2022-12-07 11:01:38,959 - root - INFO - rel_num: 1
2022-12-07 11:01:38,959 - root - INFO - pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:01:38,959 - root - INFO - pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:01:38,959 - root - INFO - pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:01:38,959 - root - INFO - rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:01:38,959 - root - INFO - rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:01:38,959 - root - INFO - rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:01:38,960 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:01:38,960 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:01:38,961 - transformers.configuration_utils - INFO - Model config BertConfig {
"attention_probs_dropout_prob": 0.1,
"gradient_checkpointing": false,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}
2022-12-07 11:01:38,961 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.
2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:01:44,022 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:01:46,404 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:02:34,042 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:02:36,835 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:02:37,215 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:02:43,524 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:02:43,728 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:02:44,126 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
2022-12-07 11:02:52,278 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/test with 914 objects.
2022-12-07 11:02:52,535 - root - INFO - Loaded sentence RE dataset aligned weights with 955 samples.
2022-12-07 11:02:54,933 - root - INFO - === Epoch 0 train ===
0%| | 0/383 [00:00<?, ?it/s]2022-12-07 11:02:56,232 - root - INFO - Arguments:
2022-12-07 11:02:56,232 - root - INFO - pretrain_path: bert-base-uncased
2022-12-07 11:02:56,232 - root - INFO - ckpt: MEGA
2022-12-07 11:02:56,232 - root - INFO - pooler: entity
2022-12-07 11:02:56,232 - root - INFO - only_test: False
2022-12-07 11:02:56,232 - root - INFO - mask_entity: False
2022-12-07 11:02:56,232 - root - INFO - metric: micro_f1
2022-12-07 11:02:56,232 - root - INFO - dataset: ours
2022-12-07 11:02:56,232 - root - INFO - train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:02:56,232 - root - INFO - val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:02:56,232 - root - INFO - test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:02:56,232 - root - INFO - rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:02:56,232 - root - INFO - batch_size: 32
2022-12-07 11:02:56,232 - root - INFO - lr: 2e-05
2022-12-07 11:02:56,232 - root - INFO - max_length: 128
2022-12-07 11:02:56,232 - root - INFO - max_epoch: 10
2022-12-07 11:02:56,232 - root - INFO - rel_num: 1
2022-12-07 11:02:56,232 - root - INFO - pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:02:56,232 - root - INFO - pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:02:56,232 - root - INFO - pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:02:56,232 - root - INFO - rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:02:56,232 - root - INFO - rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:02:56,232 - root - INFO - rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:02:56,233 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - Model config BertConfig {
"attention_probs_dropout_prob": 0.1,
"gradient_checkpointing": false,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}
2022-12-07 11:02:56,233 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:02:58,948 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:02:59,820 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:04:02,508 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:04:09,209 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:04:09,422 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:04:18,070 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:04:18,487 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:04:18,758 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "D:\Code\pyProject\Mega\example\train.py", line 112, in <module>
framework = opennre.framework.SentenceRE(
File "D:\Code\pyProject\Mega\opennre\framework\sentence_re.py", line 56, in __init__
self.test_loader = SentenceRELoader(
File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 204, in SentenceRELoader
dataset = SentenceREDataset(text_path=text_path, rel_path=rel_path, pic_path=pic_path,
File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in __init__
feature_list = [float(feature) for feature in feature_list]
File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in <listcomp>
feature_list = [float(feature) for feature in feature_list]
MemoryError
from mega.
Try changing train.py as shown above. The MemoryError is probably still caused by the data being read repeatedly.
from mega.
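@bingoarthur Thanks! I tried it. What is the difference before and after this change? On my side it seems that only the path separators became '/', and the problem is still not solved.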
D:\Code\pyProject\Mega\venv\Scripts\python.exe D:\Code\pyProject\Mega\example\train.py --dataset ours --max_epoch 10 --batch_size 32 --metric micro_f1 --lr 2e-5 --ckpt MEGA
Before the change:
.\benchmark\ours\txt/ours_train.txt
.\benchmark\ours\txt/ours_val.txt
.\benchmark\ours\txt/ours_test.txt
.\benchmark\ours\imgSG/train
.\benchmark\ours\imgSG/val
.\benchmark\ours\imgSG/test
.\benchmark\ours\rel_1/train
.\benchmark\ours\rel_1/val
.\benchmark\ours\rel_1/test
.\benchmark\ours\ours_rel2id.json
After the change:
./benchmark/ours/txt/ours_train.txt
./benchmark/ours/txt/ours_val.txt
./benchmark/ours/txt/ours_test.txt
./benchmark/ours/imgSG/train
./benchmark/ours/imgSG/val
./benchmark/ours/imgSG/test
./benchmark/ours/rel_1/train
./benchmark/ours/rel_1/val
./benchmark/ours/rel_1/test
./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO - Arguments:
2022-12-07 15:28:44,106 - root - INFO - pretrain_path: bert-base-uncased
2022-12-07 15:28:44,106 - root - INFO - ckpt: MEGA
2022-12-07 15:28:44,106 - root - INFO - pooler: entity
2022-12-07 15:28:44,106 - root - INFO - only_test: False
2022-12-07 15:28:44,106 - root - INFO - mask_entity: False
2022-12-07 15:28:44,106 - root - INFO - metric: micro_f1
2022-12-07 15:28:44,106 - root - INFO - dataset: ours
2022-12-07 15:28:44,106 - root - INFO - train_file: ./benchmark/ours/txt/ours_train.txt
2022-12-07 15:28:44,106 - root - INFO - val_file: ./benchmark/ours/txt/ours_val.txt
2022-12-07 15:28:44,106 - root - INFO - test_file: ./benchmark/ours/txt/ours_test.txt
2022-12-07 15:28:44,106 - root - INFO - rel2id_file: ./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO - batch_size: 32
2022-12-07 15:28:44,106 - root - INFO - lr: 2e-05
2022-12-07 15:28:44,106 - root - INFO - max_length: 128
2022-12-07 15:28:44,106 - root - INFO - max_epoch: 10
2022-12-07 15:28:44,106 - root - INFO - rel_num: 1
2022-12-07 15:28:44,107 - root - INFO - pic_train_file: ./benchmark/ours/imgSG/train
2022-12-07 15:28:44,107 - root - INFO - pic_val_file: ./benchmark/ours/imgSG/val
2022-12-07 15:28:44,107 - root - INFO - pic_test_file: ./benchmark/ours/imgSG/test
2022-12-07 15:28:44,107 - root - INFO - rel_train_file: ./benchmark/ours/rel_1/train
2022-12-07 15:28:44,107 - root - INFO - rel_val_file: ./benchmark/ours/rel_1/val
2022-12-07 15:28:44,107 - root - INFO - rel_test_file: ./benchmark/ours/rel_1/test
2022-12-07 15:28:44,107 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - Model config BertConfig {
"attention_probs_dropout_prob": 0.1,
"gradient_checkpointing": false,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 30522
}
2022-12-07 15:28:44,108 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.
2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 15:28:45,637 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 15:28:46,531 - root - INFO - Loaded sentence RE dataset ./benchmark/ours/txt/ours_train.txt with 12247 lines and 23 relations.
... (the rest, up to the MemoryError, is the same)
from mega.
@lentikr Oh, are you loading BERT repeatedly? Try pointing BERT to a local path.
from mega.
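That does not seem to be the problem. I debugged it, and the issue appears at the tqdm progress bar: after the code in the red box in the screenshot runs, execution jumps back to the beginning of train.py and runs the whole script again. Very puzzling.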
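One plausible explanation, based on the `multiprocessing\spawn.py` and `runpy.run_path` frames in the traceback above: on Windows, worker processes are started with the "spawn" method, which re-imports the main script in each child. If the data loader starts workers when the progress bar begins iterating (for example a PyTorch `DataLoader` built with `num_workers` greater than 0, which is an assumption about this code), any top-level code in train.py runs again in every worker, including the dataset loading. The usual remedy is a `__main__` guard. The sketch below only shows the pattern; `build_framework` and its return value are hypothetical stand-ins, not the actual MEGA code.

```python
import logging


def build_framework():
    """Hypothetical stand-in for the expensive setup in train.py
    (parsing arguments, loading BERT, reading the feature files)."""
    logging.info("Loading datasets (expensive).")
    return {"train_batches": 383}


def main():
    # All expensive work lives inside main(), not at module top level.
    framework = build_framework()
    # The training loop, and any worker start-up it triggers, would run here.
    return framework


# A spawned child re-imports this file under a module name other than
# "__main__", so the guard is false there and the setup is skipped.
if __name__ == "__main__":
    main()
```

If worker processes are not needed, creating the loader with `num_workers=0` should also avoid the re-import, at the cost of loading batches in the main process.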
from mega.
@bingoarthur Could you tell me which version of the tqdm library you are using?
from mega.
@lentikr tqdm 4.64.1, Python 3.8, PyTorch 1.12.1
from mega.
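This issue can be closed. Use the image dataset that could not be opened at first, which works with utf-8 decoding; do not swap it for the image dataset that can be viewed.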
from mega.
@bingoarthur Hello,我对源码进行如上修改后可以解决utf-8报错了,但运行train命令后似乎会重复载入一次模型,随后内存占满(我是32GB的内存)导致MemeryError,请问您有遇到类似的问题吗?
2022-12-07 11:01:38,958 - root - INFO - Arguments: 2022-12-07 11:01:38,958 - root - INFO - pretrain_path: bert-base-uncased 2022-12-07 11:01:38,958 - root - INFO - ckpt: MEGA 2022-12-07 11:01:38,958 - root - INFO - pooler: entity 2022-12-07 11:01:38,958 - root - INFO - only_test: False 2022-12-07 11:01:38,958 - root - INFO - mask_entity: False 2022-12-07 11:01:38,958 - root - INFO - metric: micro_f1 2022-12-07 11:01:38,958 - root - INFO - dataset: ours 2022-12-07 11:01:38,958 - root - INFO - train_file: .\benchmark\ours\txt/ours_train.txt 2022-12-07 11:01:38,958 - root - INFO - val_file: .\benchmark\ours\txt/ours_val.txt 2022-12-07 11:01:38,959 - root - INFO - test_file: .\benchmark\ours\txt/ours_test.txt 2022-12-07 11:01:38,959 - root - INFO - rel2id_file: .\benchmark\ours\ours_rel2id.json 2022-12-07 11:01:38,959 - root - INFO - batch_size: 32 2022-12-07 11:01:38,959 - root - INFO - lr: 2e-05 2022-12-07 11:01:38,959 - root - INFO - max_length: 128 2022-12-07 11:01:38,959 - root - INFO - max_epoch: 10 2022-12-07 11:01:38,959 - root - INFO - rel_num: 1 2022-12-07 11:01:38,959 - root - INFO - pic_train_file: .\benchmark\ours\imgSG/train 2022-12-07 11:01:38,959 - root - INFO - pic_val_file: .\benchmark\ours\imgSG/val 2022-12-07 11:01:38,959 - root - INFO - pic_test_file: .\benchmark\ours\imgSG/test 2022-12-07 11:01:38,959 - root - INFO - rel_train_file: .\benchmark\ours\rel_1/train 2022-12-07 11:01:38,959 - root - INFO - rel_val_file: .\benchmark\ours\rel_1/val 2022-12-07 11:01:38,959 - root - INFO - rel_test_file: .\benchmark\ours\rel_1/test 2022-12-07 11:01:38,960 - root - INFO - Loading BERT pre-trained checkpoint. 
2022-12-07 11:01:38,960 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json 2022-12-07 11:01:38,961 - transformers.configuration_utils - INFO - Model config BertConfig { "attention_probs_dropout_prob": 0.1, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "type_vocab_size": 2, "vocab_size": 30522 } 2022-12-07 11:01:38,961 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin 2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel. 2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased. If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training. 2022-12-07 11:01:44,022 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084 2022-12-07 11:01:46,404 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations. 2022-12-07 11:02:34,042 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects. 2022-12-07 11:02:36,835 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples. 
2022-12-07 11:02:37,215 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:02:43,524 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:02:43,728 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:02:44,126 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
2022-12-07 11:02:52,278 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/test with 914 objects.
2022-12-07 11:02:52,535 - root - INFO - Loaded sentence RE dataset aligned weights with 955 samples.
2022-12-07 11:02:54,933 - root - INFO - === Epoch 0 train ===
0%| | 0/383 [00:00<?, ?it/s]
2022-12-07 11:02:56,232 - root - INFO - Arguments:
2022-12-07 11:02:56,232 - root - INFO - pretrain_path: bert-base-uncased
2022-12-07 11:02:56,232 - root - INFO - ckpt: MEGA
2022-12-07 11:02:56,232 - root - INFO - pooler: entity
2022-12-07 11:02:56,232 - root - INFO - only_test: False
2022-12-07 11:02:56,232 - root - INFO - mask_entity: False
2022-12-07 11:02:56,232 - root - INFO - metric: micro_f1
2022-12-07 11:02:56,232 - root - INFO - dataset: ours
2022-12-07 11:02:56,232 - root - INFO - train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:02:56,232 - root - INFO - val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:02:56,232 - root - INFO - test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:02:56,232 - root - INFO - rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:02:56,232 - root - INFO - batch_size: 32
2022-12-07 11:02:56,232 - root - INFO - lr: 2e-05
2022-12-07 11:02:56,232 - root - INFO - max_length: 128
2022-12-07 11:02:56,232 - root - INFO - max_epoch: 10
2022-12-07 11:02:56,232 - root - INFO - rel_num: 1
2022-12-07 11:02:56,232 - root - INFO - pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:02:56,232 - root - INFO - pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:02:56,232 - root - INFO - pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:02:56,232 - root - INFO - rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:02:56,232 - root - INFO - rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:02:56,232 - root - INFO - rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:02:56,233 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
2022-12-07 11:02:56,233 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased. If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:02:58,948 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:02:59,820 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:04:02,508 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:04:09,209 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:04:09,422 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:04:18,070 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:04:18,487 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:04:18,758 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Code\pyProject\Mega\example\train.py", line 112, in <module>
    framework = opennre.framework.SentenceRE(
  File "D:\Code\pyProject\Mega\opennre\framework\sentence_re.py", line 56, in __init__
    self.test_loader = SentenceRELoader(
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 204, in SentenceRELoader
    dataset = SentenceREDataset(text_path=text_path, rel_path=rel_path, pic_path=pic_path,
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in __init__
    feature_list = [float(feature) for feature in feature_list]
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in <listcomp>
    feature_list = [float(feature) for feature in feature_list]
MemoryError
from mega.
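A note on the traceback above: it enters train.py through multiprocessing\spawn.py and runpy.run_path, which is what happens on Windows when a child process is spawned and re-imports the main script. That would explain why the arguments, the BERT checkpoint and all three datasets are loaded a second time right after "Epoch 0 train" starts. If the loaders are built on torch.utils.data.DataLoader with num_workers > 0 (an assumption, since that code is not shown in this thread), each worker would repeat the whole module-level setup of train.py and memory would run out. The usual remedy is to move the setup into a function behind an `if __name__ == "__main__":` guard, or to run with zero workers. Below is a minimal sketch of that pattern using only the standard library; `load_features`, `double` and `main` are hypothetical stand-ins, not names from the MEGA code.

```python
import multiprocessing as mp


def load_features(n):
    """Hypothetical stand-in for the slow dataset/model loading done in train.py."""
    return [float(i) for i in range(n)]


def double(x):
    """Hypothetical stand-in for per-sample work done by a loader worker."""
    return 2 * x


def main(num_workers=0):
    # All expensive setup happens inside main(), never at module level,
    # so a spawned worker that re-imports this file skips it.
    features = load_features(4)
    if num_workers == 0:
        # Single-process path: no child processes, nothing is re-imported.
        return [double(x) for x in features]
    ctx = mp.get_context("spawn")  # the only start method available on Windows
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(double, features)


if __name__ == "__main__":
    # Spawned children import this file under a different __name__,
    # so they never reach this call.
    print(main(num_workers=0))
```

Applied to train.py, this would mean wrapping everything from argument parsing through the `opennre.framework.SentenceRE(...)` call in a function and invoking it only under the guard. Setting the worker count to 0 is the simpler workaround if data loading speed is not a concern.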