Comments (15)

SoTWhat commented on August 21, 2024

Has this problem been solved? I'm running into it as well.

2021wangkai commented on August 21, 2024

Same here, has it been resolved?

bingoarthur commented on August 21, 2024

Just open and read the file in rb (binary) mode.
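
A minimal sketch of what that change looks like, assuming the dataset loader currently opens the feature file in text mode (the path and names below are illustrative, not the repo's actual code):

```python
# Hypothetical illustration of the 'rb' fix -- names are made up for the example.
path = "benchmark/ours/imgSG/train"  # a feature file that is not valid utf-8 text

# Before: text mode decodes the bytes as utf-8 and raises UnicodeDecodeError
# on binary content.
# with open(path, "r", encoding="utf-8") as f:
#     data = f.read()

# After: binary mode returns raw bytes and never attempts to decode them.
with open(path, "rb") as f:
    data = f.read()
print(type(data))  # <class 'bytes'>
```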

lentikr commented on August 21, 2024

Just open and read the file in rb (binary) mode.

Is this the right way to make the change?
[screenshot]

bingoarthur commented on August 21, 2024

Just open and read the file in rb (binary) mode.

Is this the right way to make the change? [screenshot]

[screenshot]

lentikr commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up (I have 32 GB of RAM) and a MemoryError is raised. Have you run into anything similar?

2022-12-07 11:01:38,958 - root - INFO - Arguments:
2022-12-07 11:01:38,958 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 11:01:38,958 - root - INFO -     ckpt: MEGA
2022-12-07 11:01:38,958 - root - INFO -     pooler: entity
2022-12-07 11:01:38,958 - root - INFO -     only_test: False
2022-12-07 11:01:38,958 - root - INFO -     mask_entity: False
2022-12-07 11:01:38,958 - root - INFO -     metric: micro_f1
2022-12-07 11:01:38,958 - root - INFO -     dataset: ours
2022-12-07 11:01:38,958 - root - INFO -     train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:01:38,958 - root - INFO -     val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:01:38,959 - root - INFO -     test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:01:38,959 - root - INFO -     rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:01:38,959 - root - INFO -     batch_size: 32
2022-12-07 11:01:38,959 - root - INFO -     lr: 2e-05
2022-12-07 11:01:38,959 - root - INFO -     max_length: 128
2022-12-07 11:01:38,959 - root - INFO -     max_epoch: 10
2022-12-07 11:01:38,959 - root - INFO -     rel_num: 1
2022-12-07 11:01:38,959 - root - INFO -     pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:01:38,959 - root - INFO -     pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:01:38,959 - root - INFO -     pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:01:38,959 - root - INFO -     rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:01:38,959 - root - INFO -     rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:01:38,959 - root - INFO -     rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:01:38,960 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:01:38,960 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:01:38,961 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 11:01:38,961 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 11:01:42,688 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:01:44,022 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:01:46,404 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:02:34,042 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:02:36,835 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:02:37,215 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:02:43,524 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:02:43,728 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:02:44,126 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
2022-12-07 11:02:52,278 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/test with 914 objects.
2022-12-07 11:02:52,535 - root - INFO - Loaded sentence RE dataset aligned weights with 955 samples.
2022-12-07 11:02:54,933 - root - INFO - === Epoch 0 train ===
  0%|                                                                                 | 0/383 [00:00<?, ?it/s]2022-12-07 11:02:56,232 - root - INFO - Arguments:
2022-12-07 11:02:56,232 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 11:02:56,232 - root - INFO -     ckpt: MEGA
2022-12-07 11:02:56,232 - root - INFO -     pooler: entity
2022-12-07 11:02:56,232 - root - INFO -     only_test: False
2022-12-07 11:02:56,232 - root - INFO -     mask_entity: False
2022-12-07 11:02:56,232 - root - INFO -     metric: micro_f1
2022-12-07 11:02:56,232 - root - INFO -     dataset: ours
2022-12-07 11:02:56,232 - root - INFO -     train_file: .\benchmark\ours\txt/ours_train.txt
2022-12-07 11:02:56,232 - root - INFO -     val_file: .\benchmark\ours\txt/ours_val.txt
2022-12-07 11:02:56,232 - root - INFO -     test_file: .\benchmark\ours\txt/ours_test.txt
2022-12-07 11:02:56,232 - root - INFO -     rel2id_file: .\benchmark\ours\ours_rel2id.json
2022-12-07 11:02:56,232 - root - INFO -     batch_size: 32
2022-12-07 11:02:56,232 - root - INFO -     lr: 2e-05
2022-12-07 11:02:56,232 - root - INFO -     max_length: 128
2022-12-07 11:02:56,232 - root - INFO -     max_epoch: 10
2022-12-07 11:02:56,232 - root - INFO -     rel_num: 1
2022-12-07 11:02:56,232 - root - INFO -     pic_train_file: .\benchmark\ours\imgSG/train
2022-12-07 11:02:56,232 - root - INFO -     pic_val_file: .\benchmark\ours\imgSG/val
2022-12-07 11:02:56,232 - root - INFO -     pic_test_file: .\benchmark\ours\imgSG/test
2022-12-07 11:02:56,232 - root - INFO -     rel_train_file: .\benchmark\ours\rel_1/train
2022-12-07 11:02:56,232 - root - INFO -     rel_val_file: .\benchmark\ours\rel_1/val
2022-12-07 11:02:56,232 - root - INFO -     rel_test_file: .\benchmark\ours\rel_1/test
2022-12-07 11:02:56,233 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 11:02:56,233 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 11:02:56,233 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 11:02:57,786 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 11:02:58,948 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 11:02:59,820 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_train.txt with 12247 lines and 23 relations.
2022-12-07 11:04:02,508 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/train with 7356 objects.
2022-12-07 11:04:09,209 - root - INFO - Loaded sentence RE dataset aligned weights with 7647 samples.
2022-12-07 11:04:09,422 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_val.txt with 1624 lines and 23 relations.
2022-12-07 11:04:18,070 - root - INFO - Loaded image feature dataset .\benchmark\ours\imgSG/val with 931 objects.
2022-12-07 11:04:18,487 - root - INFO - Loaded sentence RE dataset aligned weights with 962 samples.
2022-12-07 11:04:18,758 - root - INFO - Loaded sentence RE dataset .\benchmark\ours\txt/ours_test.txt with 1614 lines and 23 relations.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "E:\Software\Scoop\apps\python38\current\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "E:\Software\Scoop\apps\python38\current\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Code\pyProject\Mega\example\train.py", line 112, in <module>
    framework = opennre.framework.SentenceRE(
  File "D:\Code\pyProject\Mega\opennre\framework\sentence_re.py", line 56, in __init__
    self.test_loader = SentenceRELoader(
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 204, in SentenceRELoader
    dataset = SentenceREDataset(text_path=text_path, rel_path=rel_path, pic_path=pic_path,
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in __init__
    feature_list = [float(feature) for feature in feature_list]
  File "D:\Code\pyProject\Mega\opennre\framework\data_loader.py", line 60, in <listcomp>
    feature_list = [float(feature) for feature in feature_list]
MemoryError

bingoarthur commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up and a MemoryError is raised... (full comment and log quoted above)

[screenshot]
Try changing train.py as shown above; the MemoryError is most likely still caused by the repeated data loading.
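
For context, the pattern in the log above (the Arguments block and the BERT checkpoint loaded twice, then a crash inside multiprocessing\spawn.py) is what Windows produces when a DataLoader with num_workers > 0 re-imports a script whose top-level code is unguarded. A minimal sketch of the usual fix, assuming train.py runs its setup at module level; this is the generic Python idiom, not necessarily the exact change in the screenshot:

```python
# Hypothetical sketch: guard the entry point so DataLoader worker processes
# (which re-import this module under Windows' spawn start method) do not
# re-run the whole training script.
import argparse

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", default="ours")  # illustrative arguments only
    args = parser.parse_args()
    # ... load datasets, build opennre.framework.SentenceRE(...), and train ...

if __name__ == "__main__":
    main()  # only the original process enters here; spawned workers skip it
```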

lentikr commented on August 21, 2024

[screenshot] Try changing train.py as shown above; the MemoryError is most likely still caused by the repeated data loading.

@bingoarthur Thanks! I tried it, but is there actually any difference before and after this change? On my side the only difference seems to be that the path separators became '/', and the problem is still unsolved. (More on the separators after the log below.)

D:\Code\pyProject\Mega\venv\Scripts\python.exe D:\Code\pyProject\Mega\example\train.py --dataset ours --max_epoch 10 --batch_size 32 --metric micro_f1 --lr 2e-5 --ckpt MEGA 
Before the change:
.\benchmark\ours\txt/ours_train.txt
.\benchmark\ours\txt/ours_val.txt
.\benchmark\ours\txt/ours_test.txt
.\benchmark\ours\imgSG/train
.\benchmark\ours\imgSG/val
.\benchmark\ours\imgSG/test
.\benchmark\ours\rel_1/train
.\benchmark\ours\rel_1/val
.\benchmark\ours\rel_1/test
.\benchmark\ours\ours_rel2id.json
After the change:
./benchmark/ours/txt/ours_train.txt
./benchmark/ours/txt/ours_val.txt
./benchmark/ours/txt/ours_test.txt
./benchmark/ours/imgSG/train
./benchmark/ours/imgSG/val
./benchmark/ours/imgSG/test
./benchmark/ours/rel_1/train
./benchmark/ours/rel_1/val
./benchmark/ours/rel_1/test
./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO - Arguments:
2022-12-07 15:28:44,106 - root - INFO -     pretrain_path: bert-base-uncased
2022-12-07 15:28:44,106 - root - INFO -     ckpt: MEGA
2022-12-07 15:28:44,106 - root - INFO -     pooler: entity
2022-12-07 15:28:44,106 - root - INFO -     only_test: False
2022-12-07 15:28:44,106 - root - INFO -     mask_entity: False
2022-12-07 15:28:44,106 - root - INFO -     metric: micro_f1
2022-12-07 15:28:44,106 - root - INFO -     dataset: ours
2022-12-07 15:28:44,106 - root - INFO -     train_file: ./benchmark/ours/txt/ours_train.txt
2022-12-07 15:28:44,106 - root - INFO -     val_file: ./benchmark/ours/txt/ours_val.txt
2022-12-07 15:28:44,106 - root - INFO -     test_file: ./benchmark/ours/txt/ours_test.txt
2022-12-07 15:28:44,106 - root - INFO -     rel2id_file: ./benchmark/ours/ours_rel2id.json
2022-12-07 15:28:44,106 - root - INFO -     batch_size: 32
2022-12-07 15:28:44,106 - root - INFO -     lr: 2e-05
2022-12-07 15:28:44,106 - root - INFO -     max_length: 128
2022-12-07 15:28:44,106 - root - INFO -     max_epoch: 10
2022-12-07 15:28:44,106 - root - INFO -     rel_num: 1
2022-12-07 15:28:44,107 - root - INFO -     pic_train_file: ./benchmark/ours/imgSG/train
2022-12-07 15:28:44,107 - root - INFO -     pic_val_file: ./benchmark/ours/imgSG/val
2022-12-07 15:28:44,107 - root - INFO -     pic_test_file: ./benchmark/ours/imgSG/test
2022-12-07 15:28:44,107 - root - INFO -     rel_train_file: ./benchmark/ours/rel_1/train
2022-12-07 15:28:44,107 - root - INFO -     rel_val_file: ./benchmark/ours/rel_1/val
2022-12-07 15:28:44,107 - root - INFO -     rel_test_file: ./benchmark/ours/rel_1/test
2022-12-07 15:28:44,107 - root - INFO - Loading BERT pre-trained checkpoint.
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - loading configuration file bert-base-uncased\config.json
2022-12-07 15:28:44,107 - transformers.configuration_utils - INFO - Model config BertConfig {
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

2022-12-07 15:28:44,108 - transformers.modeling_utils - INFO - loading weights file bert-base-uncased\pytorch_model.bin
2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All model checkpoint weights were used when initializing BertModel.

2022-12-07 15:28:45,611 - transformers.modeling_utils - INFO - All the weights of BertModel were initialized from the model checkpoint at bert-base-uncased.
If your task is similar to the task the model of the ckeckpoint was trained on, you can already use BertModel for predictions without further training.
2022-12-07 15:28:45,637 - transformers.tokenization_utils_base - INFO - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at C:\Users\XJA/.cache\torch\transformers\26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2022-12-07 15:28:46,531 - root - INFO - Loaded sentence RE dataset ./benchmark/ours/txt/ours_train.txt with 12247 lines and 23 relations.
...(the rest, up to the MemoryError, is identical)
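
For what it's worth, Python's file APIs on Windows accept both separator styles, so the two spellings above should resolve to the same files, which would explain why the change alone has no effect. A small sketch to confirm (Windows-specific behavior):

```python
import os

# On Windows, normpath collapses '/' and '\' into the same canonical form,
# so both spellings of the training file refer to the same path.
a = r".\benchmark\ours\txt/ours_train.txt"  # before the change (mixed separators)
b = "./benchmark/ours/txt/ours_train.txt"   # after the change
print(os.path.normpath(a) == os.path.normpath(b))  # True on Windows
```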

bingoarthur commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT.
[screenshots]

lentikr commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT. [screenshots]

It doesn't seem to be that. I did some debugging: the problem appears to show up at the tqdm progress bar. Once the code in the red box in the screenshot runs, execution jumps back to the beginning of train.py and runs the whole script again. Very puzzling.
[screenshot]
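
One way to test that theory: the hand-off to tqdm is exactly where iteration over the DataLoader begins, which is when PyTorch spawns its worker processes. A hedged sketch of a quick experiment (num_workers is a standard torch.utils.data.DataLoader parameter; the tiny dataset is only a stand-in):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.zeros(8, 3))  # stand-in for the real dataset

# num_workers=0 keeps all loading in the main process, so train.py is never
# re-imported by spawned workers. If the duplicated "Arguments:" block and
# the MemoryError disappear, spawn-time re-execution was the cause.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)
for (batch,) in loader:
    pass
```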

lentikr commented on August 21, 2024

@lentikr Oh, are you loading BERT repeatedly? Try using a local path for BERT. [screenshots]

@bingoarthur Could I ask which version of the tqdm library you're using?

bingoarthur commented on August 21, 2024

@lentikr tqdm 4.64.1, Python 3.8, PyTorch 1.12.1

lentikr commented on August 21, 2024

This issue can be closed now: use the original image dataset, the one whose files could not be opened as pictures, directly with utf-8 encoding, and do not swap it for a dataset of viewable images.

zhumying commented on August 21, 2024

[screenshot]

@bingoarthur Hello, after making the change above, the utf-8 error is resolved, but running the train command seems to load the model a second time; memory then fills up and a MemoryError is raised... (full comment and log quoted above)

[screenshot]
Have you ever run into this problem?
