Comments (15)

SoTWhat commented on August 21, 2024

Has this problem been solved? I'm running into it as well.

2021wangkai commented on August 21, 2024

Same here, has it been resolved?

bingoarthur commented on August 21, 2024

Just open and read the file in rb (binary) mode.
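
A minimal sketch of what that change looks like, assuming the dataset loader currently opens the feature file in text mode (the path and names below are illustrative, not the repo's actual code):

```python
# Hypothetical illustration of the 'rb' fix -- names are made up for the example.
path = "benchmark/ours/imgSG/train"  # a feature file that is not valid utf-8 text

# Before: text mode decodes the bytes as utf-8 and raises UnicodeDecodeError
# on binary content.
# with open(path, "r", encoding="utf-8") as f:
#     data = f.read()

# After: binary mode returns raw bytes and never attempts to decode them.
with open(path, "rb") as f:
    data = f.read()
print(type(data))  # <class 'bytes'>
```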

lentikr commented on August 21, 2024

Just open and read the file in rb (binary) mode.

Is this the right way to make the change?
[screenshot]

bingoarthur commented on August 21, 2024

Just open and read the file in rb (binary) mode.

Is this the right way to make the change? [screenshot]

[screenshot]

lentikr commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up (I have 32 GB of RAM) and a MemoryError is raised. Have you run into anything similar?

2022-12-07 11:01:38,958 - root - INFO - Arguments:
2022-12-07 11:01:38,958 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 11:01:38,958 - root - INFO -     ckpt: MEGA
2022-12-07 11:01:38,958 - root - INFO -     pooler: entity
2022-12-07 11:01:38,958 - root - INFO -     only_test: False
2022-12-07 11:01:38,958 - root - INFO -     mask_entity: False
2022-12-07 11:01:38,958 - root - INFO -     metric: micro_f1
2022-12-07 11:01:38,958 - root - INFO -     dataset: ours
2022-12-07 11:01:38,958 - root - INFO -     train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:01:38,958 - root - INFO -     val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:01:38,959 - root - INFO -     test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:01:38,959 - root - INFO -     rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:01:38,959 - root - INFO -     batch_size: 32
2022-12-07 11:01:38,959 - root - INFO -     lr: 2e-05
2022-12-07 11:01:38,959 - root - INFO -     max_length: 128
2022-12-07 11:01:38,959 - root - INFO -     max_epoch: 10
2022-12-07 11:01:38,959 - root - INFO -     rel_num: 1
2022-12-07 11:01:38,959 - root - INFO -     pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:01:38,959 - root - INFO -     pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:01:38,959 - root - INFO -     pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:01:38,959 - root - INFO -     rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:01:38,959 - root - INFO -     rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:01:38,959 - root - INFO -     rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:01:38,960 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:01:38,960 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:01:38,961 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 11:01:38,961 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:01:44,022 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:01:46,404 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:02:34,042 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:02:36,835 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:02:37,215 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:02:43,524 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:02:43,728 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:02:44,126 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
2022-12-07 11:02:52,278 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/test with 914 objects.
2022-12-07 11:02:52,535 - root - INFO - Loaded sentence RE dataset aligned weights with 955 samples.
2022-12-07 11:02:54,933 - root - INFO - === Epoch 0 train ===
  0%|                                                                                 | 0/383 [00:00<?, ?it/s]2022-12-07 11:02:56,232 - root - INFO - Arguments:
2022-12-07 11:02:56,232 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 11:02:56,232 - root - INFO -     ckpt: MEGA
2022-12-07 11:02:56,232 - root - INFO -     pooler: entity
2022-12-07 11:02:56,232 - root - INFO -     only_test: False
2022-12-07 11:02:56,232 - root - INFO -     mask_entity: False
2022-12-07 11:02:56,232 - root - INFO -     metric: micro_f1
2022-12-07 11:02:56,232 - root - INFO -     dataset: ours
2022-12-07 11:02:56,232 - root - INFO -     train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:02:56,232 - root - INFO -     val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:02:56,232 - root - INFO -     test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:02:56,232 - root - INFO -     rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:02:56,232 - root - INFO -     batch_size: 32
2022-12-07 11:02:56,232 - root - INFO -     lr: 2e-05
2022-12-07 11:02:56,232 - root - INFO -     max_length: 128
2022-12-07 11:02:56,232 - root - INFO -     max_epoch: 10
2022-12-07 11:02:56,232 - root - INFO -     rel_num: 1
2022-12-07 11:02:56,232 - root - INFO -     pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:02:56,232 - root - INFO -     pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:02:56,232 - root - INFO -     pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:02:56,232 - root - INFO -     rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:02:56,232 - root - INFO -     rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:02:56,232 - root - INFO -     rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:02:56,233 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 11:02:56,233 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:02:58,948 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:02:59,820 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:04:02,508 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:04:09,209 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:04:09,422 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:04:18,070 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:04:18,487 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:04:18,758 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Code\pyProject\Mega\example\train.py", line 112, in <module>
    framework = opennre.framework.SentenceRE(
  File "D:\Code\pyProject\Mega\opennre\framework\sentence_re.py", line 56, in __init__
    self.test_loader = SentenceRELoader(
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 204, in SentenceRELoader
    dataset = SentenceREDataset(text_path=text_path, rel_path=rel_path, pic_path=pic_path,
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in __init__
    feature_list = [float(feature) for feature in feature_list]
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in <listcomp>
    feature_list = [float(feature) for feature in feature_list]
MemoryError

bingoarthur commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up and a MemoryError is raised... (full comment and log quoted above)

[screenshot]
Try changing train.py as shown above; the MemoryError is most likely still caused by the repeated data loading.
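
For context, the pattern in the log above (the Arguments block and the BERT checkpoint loaded twice, then a crash inside multiprocessing\spawn.py) is what Windows produces when a DataLoader with num_workers > 0 re-imports a script whose top-level code is unguarded. A minimal sketch of the usual fix, assuming train.py runs its setup at module level; this is the generic Python idiom, not necessarily the exact change in the screenshot:

```python
# Hypothetical sketch: guard the entry point so DataLoader worker processes
# (which re-import this module under Windows' spawn start method) do not
# re-run the whole training script.
import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", default="ours")  # illustrative arguments only
    args = parser.parse_args()
    # ... load datasets, build opennre.framework.SentenceRE(...), and train ...

if __name__ == "__main__":
    main()  # only the original process enters here; spawned workers skip it
```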

lentikr commented on August 21, 2024

[screenshot] Try changing train.py as shown above; the MemoryError is most likely still caused by the repeated data loading.

@bingoarthur Thanks! I tried it, but is there actually any difference before and after this change? On my side the only difference seems to be that the path separators became '/', and the problem is still unsolved. (More on the separators after the log below.)

D:\Code\pyProject\Mega\venv\Scripts\python.exe D:\Code\pyProject\Mega\example\train.py --dataset ours --max_epoch 10 --batch_size 32 --metric micro_f1 --lr 2e-5 --ckpt MEGA 
Before the change:
.\benchmark\ours\txt/ours_train.txt
.\benchmark\ours\txt/ours_val.txt
.\benchmark\ours\txt/ours_test.txt
.\benchmark\ours\imgSG/train
.\benchmark\ours\imgSG/val
.\benchmark\ours\imgSG/test
.\benchmark\ours\rel_1/train
.\benchmark\ours\rel_1/val
.\benchmark\ours\rel_1/test
.\benchmark\ours\ours_rel2id.json
After the change:
./benchmark/ours/txt/ours_train.txt
./benchmark/ours/txt/ours_val.txt
./benchmark/ours/txt/ours_test.txt
./benchmark/ours/imgSG/train
./benchmark/ours/imgSG/val
./benchmark/ours/imgSG/test
./benchmark/ours/rel_1/train
./benchmark/ours/rel_1/val
./benchmark/ours/rel_1/test
./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO - Arguments:
2022-12-07 15:28:44,106 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 15:28:44,106 - root - INFO -     ckpt: MEGA
2022-12-07 15:28:44,106 - root - INFO -     pooler: entity
2022-12-07 15:28:44,106 - root - INFO -     only_test: False
2022-12-07 15:28:44,106 - root - INFO -     mask_entity: False
2022-12-07 15:28:44,106 - root - INFO -     metric: micro_f1
2022-12-07 15:28:44,106 - root - INFO -     dataset: ours
2022-12-07 15:28:44,106 - root - INFO -     train_file: ./benchmark/ours/txt/ours_train.txt
2022-12-07 15:28:44,106 - root - INFO -     val_file: ./benchmark/ours/txt/ours_val.txt
2022-12-07 15:28:44,106 - root - INFO -     test_file: ./benchmark/ours/txt/ours_test.txt
2022-12-07 15:28:44,106 - root - INFO -     rel2id_file: ./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO -     batch_size: 32
2022-12-07 15:28:44,106 - root - INFO -     lr: 2e-05
2022-12-07 15:28:44,106 - root - INFO -     max_length: 128
2022-12-07 15:28:44,106 - root - INFO -     max_epoch: 10
2022-12-07 15:28:44,106 - root - INFO -     rel_num: 1
2022-12-07 15:28:44,107 - root - INFO -     pic_train_file: ./benchmark/ours/imgSG/train
2022-12-07 15:28:44,107 - root - INFO -     pic_val_file: ./benchmark/ours/imgSG/val
2022-12-07 15:28:44,107 - root - INFO -     pic_test_file: ./benchmark/ours/imgSG/test
2022-12-07 15:28:44,107 - root - INFO -     rel_train_file: ./benchmark/ours/rel_1/train
2022-12-07 15:28:44,107 - root - INFO -     rel_val_file: ./benchmark/ours/rel_1/val
2022-12-07 15:28:44,107 - root - INFO -     rel_test_file: ./benchmark/ours/rel_1/test
2022-12-07 15:28:44,107 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 15:28:44,108 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 15:28:45,637 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 15:28:46,531 - root - INFO - Loaded sentence RE dataset ./benchmark/ours/txt/ours_train.txt with 12247 lines and 23 relations.
...(the rest, up to the MemoryError, is identical)
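
For what it's worth, Python's file APIs on Windows accept both separator styles, so the two spellings above should resolve to the same files, which would explain why the change alone has no effect. A small sketch to confirm (Windows-specific behavior):

```python
import os

# On Windows, normpath collapses '/' and '\' into the same canonical form,
# so both spellings of the training file refer to the same path.
a = r".\benchmark\ours\txt/ours_train.txt"  # before the change (mixed separators)
b = "./benchmark/ours/txt/ours_train.txt"   # after the change
print(os.path.normpath(a) == os.path.normpath(b))  # True on Windows
```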

bingoarthur commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT.
[screenshots]

lentikr commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT. [screenshots]

It doesn't seem to be that. I did some debugging: the problem appears to show up at the tqdm progress bar. Once the code in the red box in the screenshot runs, execution jumps back to the beginning of train.py and runs the whole script again. Very puzzling.
[screenshot]
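
One way to test that theory: the hand-off to tqdm is exactly where iteration over the DataLoader begins, which is when PyTorch spawns its worker processes. A hedged sketch of a quick experiment (num_workers is a standard torch.utils.data.DataLoader parameter; the tiny dataset is only a stand-in):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(8, 3))  # stand-in for the real dataset

# num_workers=0 keeps all loading in the main process, so train.py is never
# re-imported by spawned workers. If the duplicated "Arguments:" block and
# the MemoryError disappear, spawn-time re-execution was the cause.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)
for (batch,) in loader:
    pass
```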

lentikr commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT. [screenshots]

@bingoarthur Could I ask which version of the tqdm library you're using?

bingoarthur commented on August 21, 2024

@lentikr tqdm 4.64.1, Python 3.8, PyTorch 1.12.1

lentikr commented on August 21, 2024

This issue can be closed now: use the original image dataset, the one whose files could not be opened as pictures, directly with utf-8 encoding, and do not swap it for a dataset of viewable images.

zhumying commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up and a MemoryError is raised... (full comment and log quoted above)

[screenshot]
Have you ever run into this problem?
