rebel's People

Contributors

davidfrompandora, littlepea13, m0baxter, tomasonjo

rebel's Issues

Adding a softmax layer with custom relations

Hello :) Thank you first of all for all your work on the relation extraction domain.

I would like to ask whether it is possible to add a softmax layer to the REBEL model architecture that assigns a probability to each relation in a list of candidates, indicating which one fits best. The purpose is to have only a few dozen custom relations to choose from in the final stage. Do you think this would be possible?
Alternatively, is there any way you would recommend to filter the relation list of the pretrained REBEL model down to a user-defined list of about 100 relations? Any guidance on this would be very helpful.

Thanks again!
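
For the second option in this issue (restricting the output to a user-defined relation list), one simple approach is to filter the decoded triplets after generation. The sketch below is only an illustration under assumptions: it assumes each triplet is a dict with "head", "type" and "tail" keys (the fields mentioned in another issue on this page), and `filter_triplets` and the example data are hypothetical names, not part of the REBEL code base.

```python
from typing import Dict, Iterable, List, Set


def filter_triplets(
    triplets: Iterable[Dict[str, str]], allowed_relations: Set[str]
) -> List[Dict[str, str]]:
    """Keep only triplets whose relation label is in the user-defined list."""
    # Normalise case and whitespace so "Country " and "country" match.
    allowed = {relation.strip().lower() for relation in allowed_relations}
    return [t for t in triplets if t["type"].strip().lower() in allowed]


# Hypothetical decoded output and relation list, for illustration only.
triplets = [
    {"head": "Rome", "type": "capital of", "tail": "Italy"},
    {"head": "Rome", "type": "country", "tail": "Italy"},
]
kept = filter_triplets(triplets, {"country", "employer"})
print(kept)
```

Post-filtering does not change what the model generates, so it cannot recover a relation the model never produced; it only discards labels outside the list.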

Datasets Related Problems

Excuse me, I ran your model on the NYT dataset, but it failed.
I am not sure whether my dataset is in the right format; the error message suggests a problem with the data type.
Please see the pictures below.

picture1: When I run train.py, the problem occurs in the load_datasets function.
picture2: The NYT dataset (train.data), which I downloaded from the Copy_RE GitHub repository.
picture3: I wrote a test function that calls load_datasets directly. It still does not work.

So I don't know whether my dataset is right or wrong.
The following code also cannot run, because there is no spo_list or spo_details field in the dataset:
list_relations = zip(row['spo_list'], row['spo_details'])

picture1
picture2
picture3

Thank you very much!
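
A quick way to confirm the suspicion in this issue is to check every row for the two fields that the quoted line reads before handing the file to the loader. This is a minimal sketch, not the repository's code: `rows_missing_keys` is a hypothetical helper, and the second example row uses made-up field names to stand in for a file in a different format.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Fields read by: zip(row['spo_list'], row['spo_details'])
REQUIRED_KEYS = ("spo_list", "spo_details")


def rows_missing_keys(
    rows: Iterable[Dict], required: Sequence[str] = REQUIRED_KEYS
) -> List[Tuple[int, List[str]]]:
    """Return (row_index, missing_keys) for every row lacking a required field."""
    problems = []
    for index, row in enumerate(rows):
        missing = [key for key in required if key not in row]
        if missing:
            problems.append((index, missing))
    return problems


# Illustrative rows only; the second uses hypothetical field names.
rows = [
    {"spo_list": [], "spo_details": []},
    {"text": "some sentence", "relations": []},
]
problems = rows_missing_keys(rows)
print(problems)
```

If every row is reported, the file is most likely in a different schema than the loader expects and needs converting first.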

About the statistics of the dataset

Hi, nice work. I am trying to replicate the pretraining process, but using "rebel/datasets/rebel-short.py" I cannot get the same train/val/test numbers as those written in the paper:
image

Please enlighten me with some details!

How can I use the docred dataset with strict evaluation

When I try to train the model on the DocRED dataset by running python train.py model=rebel_model data=docred_data train=docred_train, the model does not run correctly and returns:

Traceback (most recent call last):
File "/home/weimin/rebel/src/train.py", line 150, in main
train(conf)
File "/home/weimin/rebel/src/train.py", line 146, in train
trainer.fit(pl_module, datamodule=pl_data_module)
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
results = self.accelerator_backend.train()
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
return self.train_or_test()
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in train_or_test
results = self.trainer.train()
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in train
self.run_sanity_check(self.get_model())
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 730, in run_sanity_check
_, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 662, in run_evaluation
deprecated_eval_results = self.evaluation_loop.evaluation_epoch_end()
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 210, in evaluation_epoch_end
deprecated_results = self.__run_eval_epoch_end(self.num_dataloaders, using_eval_result)
File "/home/weimin/anaconda3/envs/rebel/lib/python3.9/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 248, in __run_eval_epoch_end
eval_results = model.validation_epoch_end(eval_results)
File "/home/weimin/rebel/src/pl_modules.py", line 392, in validation_epoch_end
scores, precision, recall, f1 = re_score([item for pred in output for item in pred['predictions']], [item for pred in output for item in pred['labels']], list(relations_docred.values()), "strict")
File "/home/weimin/rebel/src/score.py", line 174, in re_score
pred_rels = {(rel["head"], rel["head_type"], rel["tail"], rel["tail_type"]) for rel in pred_sent if
File "/home/weimin/rebel/src/score.py", line 174, in
pred_rels = {(rel["head"], rel["head_type"], rel["tail"], rel["tail_type"]) for rel in pred_sent if
KeyError: 'head_type'

I checked the code in the function re_score and found that the output of this model only contains "head", "tail" and "type". What is wrong here?
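
The traceback shows the strict comparison building tuples of (head, head_type, tail, tail_type), so a prediction without entity types cannot be scored that way. The sketch below only illustrates that dependency; `relation_key` and the "untyped" mode name are hypothetical and are not taken from score.py.

```python
from typing import Dict, Tuple


def relation_key(rel: Dict[str, str], mode: str = "strict") -> Tuple[str, ...]:
    """Build the tuple used to compare a predicted relation with a gold one."""
    if mode == "strict":
        # Strict matching also compares entity types, so both must be present.
        missing = [key for key in ("head_type", "tail_type") if key not in rel]
        if missing:
            raise KeyError(f"strict mode needs {missing}; got keys {sorted(rel)}")
        return (rel["head"], rel["head_type"], rel["tail"], rel["tail_type"])
    # Hypothetical untyped mode: compare entity spans only.
    return (rel["head"], rel["tail"])


# A prediction shaped like the one described in this issue.
pred = {"head": "A", "type": "r", "tail": "B"}
untyped_key = relation_key(pred, mode="untyped")
print(untyped_key)
```

In other words, strict evaluation seems to need a data and decoding setup that produces entity types; whether the DocRED configuration in the repository provides one is not something this sketch can confirm.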

Unexpectedly bad performance on bart-base

Hi.
I wanted to see what the results would be like with bart-base. I trained on CONLL04 without changing any other parameter, but the performance is not nearly as good.

Here are my results:

processed 288 sentences with 421 relations; found: 444 relations; correct: 46.
	ALL	 TP: 46;	FP: 30;	FN: 360
		(m avg): precision: 60.53;	recall: 11.33;	f1: 19.09 (micro)
		(M avg): precision: 58.83;	recall: 10.79;	f1: 17.58 (Macro)

	killed by: 	TP: 3;	FP: 1;	FN: 44;	precision: 75.00;	recall: 6.38;	f1: 11.76;	4
	residence: 	TP: 3;	FP: 5;	FN: 95;	precision: 37.50;	recall: 3.06;	f1: 5.66;	8
	location: 	TP: 18;	FP: 6;	FN: 71;	precision: 75.00;	recall: 20.22;	f1: 31.86;	24
	headquarters location: 	TP: 17;	FP: 13;	FN: 79;	precision: 56.67;	recall: 17.71;	f1: 26.98;	30
	employer: 	TP: 5;	FP: 5;	FN: 71;	precision: 50.00;	recall: 6.58;	f1: 11.63;	10

I tested on bart-large, and it works as expected:

processed 288 sentences with 421 relations; found: 362 relations; correct: 273.
	ALL	 TP: 273;	FP: 87;	FN: 133
		(m avg): precision: 75.83;	recall: 67.24;	f1: 71.28 (micro)
		(M avg): precision: 77.78;	recall: 69.60;	f1: 73.16 (Macro)

	killed by: 	TP: 43;	FP: 5;	FN: 4;	precision: 89.58;	recall: 91.49;	f1: 90.53;	48
	residence: 	TP: 66;	FP: 37;	FN: 32;	precision: 64.08;	recall: 67.35;	f1: 65.67;	103
	location: 	TP: 53;	FP: 17;	FN: 36;	precision: 75.71;	recall: 59.55;	f1: 66.67;	70
	headquarters location: 	TP: 60;	FP: 14;	FN: 36;	precision: 81.08;	recall: 62.50;	f1: 70.59;	74
	employer: 	TP: 51;	FP: 14;	FN: 25;	precision: 78.46;	recall: 67.11;	f1: 72.34;	65

Increasing the number of steps did not help. Are these results expected? Do you happen to know how much impact the model size has on performance?
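
For reference, the micro-averaged numbers in both reports follow directly from the TP/FP/FN counts on the "ALL" lines. The short sketch below recomputes them; `micro_prf` is a hypothetical helper name, not the scorer used in the repository.

```python
from typing import Tuple


def micro_prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Micro precision, recall and F1 (in percent) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return tuple(round(100 * value, 2) for value in (precision, recall, f1))


# Counts copied from the "ALL" lines of the two runs above.
base = micro_prf(tp=46, fp=30, fn=360)
large = micro_prf(tp=273, fp=87, fn=133)
print("bart-base :", base)
print("bart-large:", large)
```

The gap is almost entirely in recall (11.33 versus 67.24), while precision differs far less.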

Error when training on DocRED: expected str, bytes or os.PathLike object, not DataFilesList

When evaluating on the DocRED dataset, I'm receiving the following error with traceback. I tried replacing the model variable with the path string, but it made no difference. I would be grateful for any hints regarding this.

`
C:\Users\i.abbasi\rebel\src> python train.py model=rebel_model data=docred_data train=docred_train
Global seed set to 42
[2022-08-07 18:30:34,790][datasets.builder][WARNING] - Using custom data configuration default-e2fda1423ea7b8aa
Downloading and preparing dataset docred_typed/default to C:\Users\i.abbasi\.cache\huggingface\datasets\docred_typed\default-e2fda1423ea7b8aa\0.0.0\2cc6999b276b6aa2b2af5101b416c33155e5f19e6f0b26864a2312d1aa57b175...
Generating train split: 0 examples [00:00, ? examples/s][2022-08-07 18:30:35,303][root][INFO] - generating examples from = [WindowsPath('C:/Users/i.abbasi/rebel/data/docred_joint/train_joint.json')]
Traceback (most recent call last):
File "train.py", line 106, in main
train(conf)
File "train.py", line 54, in train
pl_data_module = BasePLDataModule(conf, tokenizer, model)
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pytorch_lightning\core\datamodule.py", line 49, in __call__
obj = type.__call__(cls, *args, **kwargs)
File "C:\Users\i.abbasi\rebel\src\pl_data_modules.py", line 68, in __init__
self.datasets = load_dataset(conf.dataset_name, data_files={'train': conf.train_file, 'dev': conf.validation_file, 'test': conf.test_file})
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\load.py", line 1679, in load_dataset
builder_instance.download_and_prepare(
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\builder.py", line 704, in download_and_prepare
self._download_and_prepare(
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\builder.py", line 1221, in _download_and_prepare
super()._download_and_prepare(dl_manager, verify_infos, check_duplicate_keys=verify_infos)
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\builder.py", line 793, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\builder.py", line 1204, in _prepare_split
for key, record in logging.tqdm(
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\tqdm\std.py", line 1195, in __iter__
for obj in iterable:
File "C:\Users\i.abbasi\.cache\huggingface\modules\datasets_modules\datasets\docred_typed\2cc6999b276b6aa2b2af5101b416c33155e5f19e6f0b26864a2312d1aa57b175\docred_typed.py", line 101, in _generate_examples
with open(filepath) as json_file:
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\streaming.py", line 67, in wrapper
return function(*args, use_auth_token=use_auth_token, **kwargs)
File "C:\Users\i.abbasi\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\datasets\download\streaming_download_manager.py", line 423, in xopen
file = _as_posix(PurePath(file))
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\pathlib.py", line 651, in __new__
return cls._from_parts(args)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\pathlib.py", line 683, in _from_parts
drv, root, parts = self._parse_args(args)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\pathlib.py", line 667, in _parse_args
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not DataFilesList

`
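
The log line "generating examples from = [WindowsPath(...)]" suggests that `_generate_examples` receives a list-like value holding one path rather than a single path, which `open` cannot accept. One possible workaround is to unwrap the value before opening it. The sketch below is an assumption about a fix, not the maintainers' patch; `single_path` is a hypothetical helper that would be called inside the dataset script.

```python
import os
from typing import Iterable, Union

PathValue = Union[str, bytes, os.PathLike]


def single_path(filepath: Union[PathValue, Iterable[PathValue]]) -> PathValue:
    """Return one path from a value that is either a path or a list-like of paths."""
    if isinstance(filepath, (str, bytes, os.PathLike)):
        return filepath
    paths = list(filepath)
    if len(paths) != 1:
        raise ValueError(f"expected exactly one file, got {len(paths)}")
    return paths[0]


# Both a plain string and a one-element list resolve to the same path.
print(single_path("train_joint.json"))
print(single_path(["train_joint.json"]))
```

Inside the script this would be used as `with open(single_path(filepath)) as json_file:`. Installing the `datasets` version the repository was developed against may also avoid the problem, but that is untested here.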

RuntimeError: CUDA error: device-side assert triggered

I trained on a custom dataset with five entities based on the CONLL format, but when testing the checkpoint, I'm getting the following error:

/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [56,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [57,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [58,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [59,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [60,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [61,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [62,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [1,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [2,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [3,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [4,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [5,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [6,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [7,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [8,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [9,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [10,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [11,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [12,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [13,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [14,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [15,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [16,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [17,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [18,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [19,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [20,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [21,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [22,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [23,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [24,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [25,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [26,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [27,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [28,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [29,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [97,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [98,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [99,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [100,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [101,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [102,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [103,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [104,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [105,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [106,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [107,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [108,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [109,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [110,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [111,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [112,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [113,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [114,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [115,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [116,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [117,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [118,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [119,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [120,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [121,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [122,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [123,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [124,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [125,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [153,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
  File "test.py", line 75, in main
    train(conf)
  File "test.py", line 70, in train
    trainer.test(pl_module, test_dataloaders=pl_data_module.test_dataloader())
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 792, in test
    results = self.__test_given_model(model, test_dataloaders)
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 855, in __test_given_model
    results = self.fit(model)
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
    results = self.accelerator_backend.train()
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
    return self.train_or_test()
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 71, in train_or_test
    results = self.trainer.run_test()
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 699, in run_test
    eval_loop_results, _ = self.run_evaluation()
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 646, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 177, in evaluation_step
    output = self.trainer.accelerator_backend.test_step(args)
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 76, in test_step
    return self._step(self.trainer.model.test_step, args)
  File "/anaconda/envs/we/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 65, in _step
    output = model_step(*args)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/iqranlp/code/Users/i.abbasi/rebel/src/pl_modules.py", line 346, in test_step
    forward_output = self.forward(batch, labels)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/iqranlp/code/Users/i.abbasi/rebel/src/pl_modules.py", line 121, in forward
    outputs = self.model(**inputs, use_cache=False, return_dict = True, output_hidden_states=True)
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/anaconda/envs/we/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1348, in forward
    outputs = self.model(
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/anaconda/envs/we/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1235, in forward
    decoder_outputs = self.decoder(
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/anaconda/envs/we/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1023, in forward
    inputs_embeds = self.embed_tokens(input_ids) * self.embed_scale
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
    return F.embedding(
  File "/anaconda/envs/we/lib/python3.8/site-packages/torch/nn/functional.py", line 2043, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered

This error came about when I resized the model token embeddings from [50272, 1024] to [50276, 1024] to fit the checkpoint. Could you please suggest a workaround for this?
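
The assertion `srcIndex < srcSelectDimSize` inside an embedding lookup usually means that some token id is not smaller than the number of rows in the embedding matrix. A check on the CPU side can show which ids are affected before the batch reaches the GPU. This is a minimal sketch with a hypothetical helper, `out_of_range_ids`; the id values in the example are illustrative and only the size 50276 comes from the issue.

```python
from typing import Iterable, List


def out_of_range_ids(token_ids: Iterable[int], num_embeddings: int) -> List[int]:
    """Return the distinct ids that an embedding table of this size cannot index."""
    return sorted({i for i in token_ids if i < 0 or i >= num_embeddings})


# Illustrative ids checked against the resized table from the issue (50276 rows).
bad = out_of_range_ids([0, 2, 50265, 50275, 50276, 50277], num_embeddings=50276)
print(bad)
```

If ids at or above 50276 show up, the tokenizer used at test time probably has more added tokens than the resized model; comparing `len(tokenizer)` with the embedding size would confirm that. Negative ids can point to a padding value such as -100 ending up in the decoder inputs.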

AssertionError: Non-consecutive added token '<obj>' found. Should have index 50272 but has index 50265 in saved vocabulary.

Traceback:
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/script_runner.py", line 338, in _run_script
exec(code, module.__dict__)
File "/home/rahulpal/Documents/rebel-main/demo.py", line 57, in
tokenizer, model, dataset = load_models()
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/caching.py", line 573, in wrapped_func
return get_or_create_cached_value()
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/streamlit/caching.py", line 555, in get_or_create_cached_value
return_value = func(*args, **kwargs)
File "/home/rahulpal/Documents/rebel-main/demo.py", line 18, in load_models
tokenizer = AutoTokenizer.from_pretrained("Babelscape/rebel-large")
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/models/auto/tokenization_auto.py", line 416, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1705, in from_pretrained
resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
File "/home/rahulpal/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 1811, in _from_pretrained
f"Non-consecutive added token '{token}' found. "

Issue when training on the DocRED dataset

Hi,
I get the following error when trying to train on the DocRED dataset.
It seems to me that KeyError: 'labels' is the main problem. How do I fix it?
The detailed log is below.
I am in a situation where I must train using the DocRED dataset. Please help me.

/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:50: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 4 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
warnings.warn(*args, **kwargs)
Epoch 0: 32%|███████████████████████████████████████▏ | 499/1579 [02:10<04:42, 3.82it/s, loss=2.85, v_num=y2u5]Saving latest checkpoint...
Traceback (most recent call last):
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 561, in train
self.train_loop.run_training_epoch()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 556, in run_training_epoch
self.on_train_batch_end(epoch_output, batch_end_outputs, batch, batch_idx, dataloader_idx)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 226, in on_train_batch_end
self.trainer.call_hook('on_train_batch_end', batch_end_outputs, batch, batch_idx, dataloader_idx)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 925, in call_hook
trainer_hook(*args, **kwargs)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/callback_hook.py", line 147, in on_train_batch_end
callback.on_train_batch_end(self, self.get_model(), outputs, batch, batch_idx, dataloader_idx)
File "/home/kdk/rebel/src/generate_samples.py", line 39, in on_train_batch_end
labels = batch.pop("labels")
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/_collections_abc.py", line 795, in pop
value = self[key]
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 230, in __getitem__
return self.data[item]
KeyError: 'labels'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 151, in
main()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/main.py", line 37, in decorated_main
strict=strict,
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/_internal/utils.py", line 347, in _run_hydra
lambda: hydra.run(
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
raise ex
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
return func()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/_internal/utils.py", line 350, in
overrides=args.overrides,
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 112, in run
configure_logging=with_log_configuration,
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/hydra/core/utils.py", line 127, in run_job
ret.return_value = task_function(task_cfg)
File "train.py", line 147, in main
train(conf)
File "train.py", line 143, in train
trainer.fit(pl_module, datamodule=pl_data_module)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
results = self.accelerator_backend.train()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
return self.train_or_test()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in train_or_test
results = self.trainer.train()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 592, in train
self.train_loop.on_train_end()
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 156, in on_train_end
self.check_checkpoint_callback(should_save=True, is_last=True)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 190, in check_checkpoint_callback
callback.on_validation_end(self.trainer, model)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 204, in on_validation_end
self.save_checkpoint(trainer, pl_module)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 239, in save_checkpoint
self._validate_monitor_key(trainer)
File "/home/kdk/anaconda3/envs/rebel/lib/python3.7/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 517, in _validate_monitor_key
raise MisconfigurationException(m)
pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='val_F1_micro') not found in the returned metrics: ['loss']. HINT: Did you call self.log('val_F1_micro', tensor) in the LightningModule?

test error

When I run 'python test.py model=rebel_model data=conll04_data train=conll04_train do_predict=True checkpoint_path="path_to_checkpoint"', I can't load my model. The error is: omegaconf.errors.ConfigKeyError: 'str' object has no attribute '__dict__'
I am not familiar with pytorch-lightning. Could you help me?

Error in test.py

Hi,
I get the error below when running test.py. It says that the checkpoint file does not exist:
FileNotFoundError: [Errno 2] No such file or directory: '/home/manxoor/RelationExtraction/Benchmark_eval/rebel-main/src/outputs/2022-02-17/15-12-18/path_to_checkpoint'

Log:
python test.py model=rebel_model data=nyt_data train=nyt_train do_predict=True checkpoint_path="path_to_checkpoint"
2022-02-17 15:12:16.393884: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/manxoor/opt/openmpi/lib
2022-02-17 15:12:16.393920: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Global seed set to 42
[2022-02-17 15:12:32,330][datasets.builder][WARNING] - Using custom data configuration default-26df38178e3733aa
[2022-02-17 15:12:32,335][datasets.builder][WARNING] - Reusing dataset nyt (/home/manxoor/.cache/huggingface/datasets/nyt/default-26df38178e3733aa/0.0.0/1b5d3bcc4eb4689e4399ba481b0057e167e20cf2a8b0174b677a44c719a131c2)
Traceback (most recent call last):
File "test.py", line 114, in main
train(conf)
File "test.py", line 98, in train
pl_module = pl_module.load_from_checkpoint(checkpoint_path = conf.checkpoint_path, config = config, tokenizer = tokenizer, model = model)
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/pytorch_lightning/utilities/cloud_io.py", line 31, in load
with fs.open(path_or_url, "rb") as f:
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/fsspec/spec.py", line 1036, in open
**kwargs,
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 155, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 250, in __init__
self._open()
File "/home/manxoor/anaconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 255, in _open
self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/manxoor/RelationExtraction/Benchmark_eval/rebel-main/src/outputs/2022-02-17/15-12-18/path_to_checkpoint'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
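
The path in the error ends in outputs/2022-02-17/15-12-18/path_to_checkpoint, which suggests two things: the README placeholder path_to_checkpoint was passed literally, and the relative path was resolved against Hydra's per-run output directory rather than the directory the command was launched from. Below is a minimal pre-flight sketch under those assumptions. resolve_checkpoint is a hypothetical helper, not part of the REBEL repo; inside a Hydra app the launch directory could be obtained from hydra.utils.get_original_cwd().

```python
from pathlib import Path


def resolve_checkpoint(checkpoint_path: str, launch_dir: str) -> Path:
    """Return an absolute checkpoint path, or raise a readable error.

    Hypothetical helper. `launch_dir` is the directory the command was
    started from (assumption: the caller knows it, for example via
    hydra.utils.get_original_cwd()).
    """
    path = Path(checkpoint_path).expanduser()
    if not path.is_absolute():
        # Resolve against the launch directory, not the per-run output dir.
        path = Path(launch_dir) / path
    if not path.is_file():
        raise FileNotFoundError(
            f"No checkpoint file at {path}. Replace the README placeholder "
            "with the path to a real .ckpt file."
        )
    return path
```

Passing an absolute path to checkpoint_path on the command line sidesteps the working-directory question entirely.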

Error occurs when running test.py: omegaconf.errors.ConfigKeyError: 'str' object has no attribute '__dict__'

An error occurs when running test.py.
File "test.py", line 98, in train
pl_module = pl_module.load_from_checkpoint(checkpoint_path = conf.checkpoint_path, config = config, tokenizer = tokenizer, model = model)
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 157, in load_from_checkpoint
checkpoint[cls.CHECKPOINT_HYPER_PARAMS_KEY].update(kwargs)
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/_collections_abc.py", line 841, in update
self[key] = other[key]
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 259, in __setitem__
key=key, value=value, type_override=ConfigKeyError, cause=e
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/site-packages/omegaconf/base.py", line 101, in _format_and_raise
type_override=type_override,
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
_raise(ex, cause)
File "/home/xx/.conda/envs/tingzhang/lib/python3.7/site-packages/omegaconf/_utils.py", line 610, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigKeyError: 'str' object has no attribute '__dict__'
full_key: config
reference_type=Optional[Dict[Union[str, Enum], Any]]
object_type=dict

KeyError: 'labels' in generate_samples.py

I am trying a quick run on the CoNLL04 dataset. I downloaded the model into model/Rebel-large and the CoNLL04 dataset into data/conll04. I then ran train.py model=rebel_model data=conll04_data train=conll04_train and got the following error:

File "src/generate_samples.py", line 39, in on_train_batch_end
labels = batch.pop("labels")
KeyError: 'labels'

I checked the batch object. It contains the keys input_ids, attention_mask and decoder_input_ids, but no labels key.
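
Because the failing batch carries no labels entry, the unconditional batch.pop("labels") in the callback raises. The sketch below shows a defensive variant, assuming it is acceptable to skip sample generation for such batches. That is an assumption, not the maintainers' fix, and pop_labels is a hypothetical helper.

```python
from typing import Any, MutableMapping, Optional


def pop_labels(batch: MutableMapping[str, Any]) -> Optional[Any]:
    """Remove and return batch["labels"], or None when the key is absent.

    Hypothetical helper: a callback could return early on None instead of
    failing with KeyError.
    """
    return batch.pop("labels", None)


# A batch shaped like the one described above (values are made up).
batch = {
    "input_ids": [[0, 5, 2]],
    "attention_mask": [[1, 1, 1]],
    "decoder_input_ids": [[0, 7]],
}
labels = pop_labels(batch)
if labels is None:
    print("batch has no 'labels'; available keys:", sorted(batch))
```

This only hides the symptom. Why the data module produces a batch without labels still has to be tracked down separately.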

OSError: Can't load config for 'rebel/model/Rebel-large'

I followed the training steps and ran the following command in a notebook:

!HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 python rebel/src/train.py model=rebel_model data=nyt_data train=nyt_train

I get the error below. I have checked, and the model path exists in the directory. Do you have any idea why this might be happening?

OSError: Can't load config for 'rebel/model/Rebel-large'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'rebel/model/Rebel-large' is the correct path to a directory containing a config.json file
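
The error message carries its own checklist: the path must point to a directory that contains a config.json file. Since rebel/model/Rebel-large is a relative path, it is resolved against the working directory of the training process, which may differ from the notebook's directory. The sketch below is a small diagnostic under that assumption. describe_model_dir is a hypothetical helper that only inspects the file system.

```python
from pathlib import Path


def describe_model_dir(model_dir: str) -> str:
    """Report how a local model path resolves and whether config.json exists.

    Hypothetical diagnostic helper; it does not load any model.
    """
    resolved = Path(model_dir).expanduser().resolve()
    if not resolved.is_dir():
        return f"{resolved} is not a directory (cwd: {Path.cwd()})"
    if not (resolved / "config.json").is_file():
        return f"{resolved} exists but has no config.json"
    return f"{resolved} looks loadable"
```

Running it from the same working directory as the training process shows which absolute path the relative name actually points to.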

relation_counts.tsv has 230 relations not 220

Hi,
great paper!

In your paper you state that you train on 220 relations but relation_counts.tsv has 230 relations. Is that a typo in the paper or is the relations file outdated?

To pre-train our model, we use a sentence-level version of it, where only relations between entities present in each sentence are kept. We keep the 220 most frequent relations in the train split. We fine-tune REBEL (using BART-large as the base model) on the silver dataset for 6 epochs. We refer to the resulting model as REBEL_pre-training. While REBEL_pre-training is in and of itself capable of extracting relations subsuming about 220 types, we show that it also functions as a base step for downstream RE and RC tasks, which are fine-tuned on top of it.
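
A quick way to compare the file against the "220 most frequent relations" statement is to count its rows and list the top k. The sketch below assumes relation_counts.tsv holds one relation per line as name, tab, count, with no header. That layout is an assumption, since the file's format is not shown in this thread, and top_relations is a hypothetical helper.

```python
import csv
import io
from typing import List, Tuple


def top_relations(tsv_text: str, k: int) -> Tuple[int, List[str]]:
    """Return (number of relations in the file, names of the k most frequent).

    Assumes tab-separated `relation<TAB>count` rows without a header line.
    """
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    counts = [(name, int(count)) for name, count in rows]
    counts.sort(key=lambda item: item[1], reverse=True)
    return len(counts), [name for name, _ in counts[:k]]


# Made-up sample rows, only to show the call.
sample = "country\t120\nplace of birth\t45\nemployer\t80\n"
total, kept = top_relations(sample, k=2)
print(total, kept)
```

With the real file, total would show whether it lists 220 or 230 relations, and kept would show which ones a top-220 cut retains.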

CUDA out of memory

Hi,

when I try to execute the training procedure with the command:
python src/train.py model=rebel_model data=conll04_data train=conll04_train

I get the following error:

Epoch 0:  50%| 3/6 [00:00<00:00,  4.08it/s, loss=3.07, v_num=eoyq
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
| 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "src/train.py", line 113, in main
    train(conf)
  File "src/train.py", line 109, in train
    trainer.fit(pl_module, datamodule=pl_data_module)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
    self._run(model)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 756, in _run
    self.dispatch()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 797, in dispatch
    self.accelerator.start_training(self)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
    self._results = trainer.run_stage()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 807, in run_stage
    return self.run_train()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 869, in run_train
    self.train_loop.run_training_epoch()
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 576, in run_training_epoch
    self.trainer.run_evaluation(on_epoch=True)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 962, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 174, in evaluation_step
    output = self.trainer.accelerator.validation_step(args)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 226, in validation_step
    return self.training_type_plugin.validation_step(*args)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in validation_step
    return self.lightning_module.validation_step(*args, **kwargs)
  File "/workspace/rebel/src/pl_modules.py", line 324, in validation_step
    outputs['predictions'], outputs['labels'] = self.generate_triples(batch, labels)
  File "/workspace/rebel/src/pl_modules.py", line 189, in generate_triples
    generated_tokens = self.model.generate(
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation_utils.py", line 1344, in generate
    return self.beam_search(
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation_utils.py", line 2192, in beam_search
    outputs = self(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1348, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1235, in forward
    decoder_outputs = self.decoder(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 1093, in forward
    layer_outputs = decoder_layer(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 415, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/bart/modeling_bart.py", line 279, in forward
    attn_output = self.out_proj(attn_output)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py", line 1692, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.75 GiB total capacity; 14.62 GiB already allocated; 22.69 MiB free; 14.69 GiB reserved in total by PyTorch)

I am using an NVIDIA Tesla T4 GPU with 16 GB of memory.

Am I missing something? Can someone help?

spaCy not working

I get this error when running the code in the readme:

TypeError: Can't convert {'output_ids': [[[0, 50267, 2974, 5269, 14555, 1437, 50266, 4612, 1437, 50265, 2034, 11, 5, 6833, 15752, 10014, 1437, 50266, 2809, 1437, 50265, 247, 1437, 50267, 4612, 1437, 50266, 2974, 5269, 14555, 1437, 50265, 6308, 6833, 15752, 10014, 1437, 50266, 2809, 1437, 50265, 247, 2]]]} to Sequence

No predictions

Hi, I am recreating what you did, but in Dutch. The training script is running, but the model does not predict a thing, even after several epochs. Additionally, the loss does not change at all.

In the config files I changed only the paths to my data and the model names (from bart-large to mbart-50). Did I miss anything? The data was generated with Crocodile and has the correct structure.

I hope you have some time to take a look.

problem with model_saving.py

Hi,
I used your train.py script to train REBEL on the DocRED dataset.
When I try to save my model with model_saving.py so that I can use it in transformers, I get the following error:
Traceback (most recent call last):
  File "model_saving.py", line 27, in <module>
    model = pl_module.load_from_checkpoint(checkpoint_path = 'outputs/2022-09-02/07-42-36/experiments/docred/last.ckpt', config = config, tokenizer = tokenizer, model = model)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 157, in load_from_checkpoint
    checkpoint[cls.CHECKPOINT_HYPER_PARAMS_KEY].update(kwargs)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/_collections_abc.py", line 832, in update
    self[key] = other[key]
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 258, in __setitem__
    self._format_and_raise(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigKeyError: 'str' object has no attribute '__dict__'
full_key: config
reference_type=Optional[Dict[Union[str, Enum], Any]]
object_type=dict

Can I configure it manually?
The conf file in conf = omegaconf.OmegaConf.load gets read correctly and I could transfer the values manually.

Evaluation on DocRED

Hi,
Thanks for your enlightening work! I have one question regarding the evaluation metrics on DocRED:
Since DocRED has a different data format from other joint RE datasets, i.e. one entity E_i consists of several mentions (m_i,1, m_i,2, ...), I'm wondering how you compute the F1 metric. Do you simply treat relations as (mention_head, relation, mention_tail) and convert each entity pair's relation label into mention-pair relation labels (as in the problem definition of sentence-level RE), or do you follow the document-level RE format (i.e. every entity must strictly consist of all gold mentions)?
Looking forward to your reply😊
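
To make the first option in the question concrete, here is a minimal sketch of micro-averaged F1 in which a predicted triplet counts only if its (head mention, relation, tail mention) strings exactly match a gold triplet. It illustrates the mention-level reading only and is not a statement of how the REBEL authors score DocRED. The example triplets are made up.

```python
from typing import Iterable, Set, Tuple

Triplet = Tuple[str, str, str]  # (head mention, relation, tail mention)


def micro_f1(gold: Iterable[Set[Triplet]], pred: Iterable[Set[Triplet]]) -> float:
    """Micro-averaged F1 over documents, using exact string matching."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)  # predicted and correct
        fp += len(p - g)  # predicted but not in gold
        fn += len(g - p)  # in gold but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


gold = [{("Obama", "educated at", "Harvard Law School"),
         ("Obama", "place of birth", "Honolulu")}]
pred = [{("Obama", "educated at", "Harvard Law School"),
         ("Barack Obama", "place of birth", "Honolulu")}]
score = micro_f1(gold, pred)
print(score)  # 0.5
```

The second prediction names the same entity through a different mention and is counted as wrong here, which is exactly the case where mention-level and entity-level scoring diverge.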

Different results when using the same model from different sources

Hello, I was thinking of using the REBEL model which is already trained to extract relationships from DBpedia abstracts.

Following the readme.md of this repository, I saw that the same model can be used from several sources. However, when testing them I get different results:

Input text: "Barack Hussein Obama II is an American politician who is the 44th and current President of the United States. He is the first African American to hold the office and the first president born outside the continental United States. Born in Honolulu, Hawaii, Obama is a graduate of Columbia University and Harvard Law School, where he was president of the Harvard Law Review. He was a community organizer in Chicago before earning his law degree. He worked as a civil rights attorney and taught constitutional law at the University of Chicago Law School between 1992 and 2004. While serving three terms representing the 13th District in the Illinois Senate from 1997 to 2004, he ran unsuccessfully in the Democratic primary for the United States Hou"

When using the Hugging Face model as described in the readme.md:

from transformers import pipeline
triplet_extractor = pipeline('text2text-generation', model='Babelscape/rebel-large', tokenizer='Babelscape/rebel-large')
...

I get the following relationships:

Barack Hussein Obama II|educated at|Columbia University
President of the United States|officeholder|Barack Hussein Obama II
Barack Hussein Obama II|position held|President of the United States
Barack Hussein Obama II|educated at|Harvard Law School

Furthermore, when I use the model that I downloaded from the link in the readme.md and include it in the spaCy pipeline, I get:

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("rebel", after="senter", config={
    'device':-1, 'model_name':'Babelscape/rebel-large'})
...

community organizer|located in the administrative territorial entity|Chicago
University of Chicago Law School|field of work|constitutional law
Barack Hussein Obama II|position held|President of the United States
United States|ethnic group|African American
Harvard Law Review|publisher|Harvard Law School
President of the United States|officeholder|Barack Hussein Obama II
Illinois Senate|country|United States Hou
Democratic primary|country|United States Hou

I also tried the code snippet at the bottom of the Hugging Face page, under the name 'Model and Tokenizer using transformers'.

...
# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Babelscape/rebel-large")
model = AutoModelForSeq2SeqLM.from_pretrained("Babelscape/rebel-large")
gen_kwargs = {"max_length": 256,"length_penalty": 0,"num_beams": 3,"num_return_sequences": 3}
...

I got these results:

Barack Hussein Obama|position held|President of the United States
Barack Hussein Obama II|educated at|Harvard Law School
Barack Hussein Obama II|educated at|Columbia University
Barack Hussein Obama|educated at|Harvard Law School
President of the United States|officeholder|Barack Hussein Obama
Barack Hussein Obama II|place of birth|Honolulu, Hawaii
Barack Hussein Obama|educated at|Columbia University
Harvard Law Review|publisher|Harvard Law School
Barack Hussein Obama II|position held|President of the United States
President of the United States|officeholder|Barack Hussein Obama II

Maybe this is a silly question but, why do I get different behaviors if the model is the same? Am I doing something wrong?

Do you know if applying some kind of preprocessing to the input text would improve the result of the model or does it work better without tweaking the input?

Thanks!
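
One visible difference between the runs is the generation settings: the last snippet explicitly requests num_beams 3 and num_return_sequences 3, so it decodes three sequences for the input and the triplets from all of them end up in the output. That could account for near-duplicates such as "Barack Hussein Obama" and "Barack Hussein Obama II", though it is only one plausible contributor. The sketch below pools triplets from several decoded sequences and drops exact repeats. merge_triplets is a hypothetical helper, and the per-sequence lists are assumed to be already parsed into dictionaries with head, type and tail keys.

```python
from typing import Dict, List, Set, Tuple

Triplet = Dict[str, str]  # keys: "head", "type", "tail"


def merge_triplets(per_sequence: List[List[Triplet]]) -> List[Triplet]:
    """Pool triplets from several decoded sequences, dropping exact repeats."""
    seen: Set[Tuple[str, str, str]] = set()
    merged: List[Triplet] = []
    for triplets in per_sequence:
        for t in triplets:
            key = (t["head"], t["type"], t["tail"])
            if key not in seen:
                seen.add(key)
                merged.append(t)
    return merged
```

Exact de-duplication does not merge surface variants of the same entity; comparing the three sources fairly would also need identical generation arguments and identical input segmentation.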

New line at end of input text (spacy_component) causes issues

It seems that when using the spaCy pipeline, a newline at the end of the input text causes the model to return a head and tail that are not in the original text. For example, taking the example input_sentence about Barcelona from the README and appending a newline returns an additional relation whose head is '2016 Summer Olympics' (related to Barcelona, but not in the text).

This causes a further issue with the offset calculation here:

offset = (head_span.start, tail_span.start)
because head_span.start doesn't exist in this case.

A possible fix for the second issue is:

    def set_annotations(self, doc: Doc, triplets: List[dict]):
        """
        The function takes a spacy Doc object and a list of triplets (dictionaries) as input.
        For each triplet, it finds the substring in the Doc object that matches the head and tail of the triplet.
        It then creates a spacy span object for each of the head and tail.
        Finally, it creates a dictionary of the relation type, head span and tail span and adds it to the Doc object

        :param doc: the spacy Doc object
        :type doc: Doc
        :param triplets: List[dict]
        :type triplets: List[dict]
        """
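        for triplet in triplets:
            # Locate the head and tail strings in the text. re.escape keeps
            # characters such as "(" or "." from being read as regex syntax.
            head_match = re.search(re.escape(triplet["head"]), doc.text)
            tail_match = re.search(re.escape(triplet["tail"]), doc.text)
            # Convert character matches to spaCy spans. doc.char_span returns
            # None when the match does not align with token boundaries.
            head_span = tail_span = None
            if head_match is not None:
                head_span = doc.char_span(head_match.start(), head_match.end())
            if tail_match is not None:
                tail_span = doc.char_span(tail_match.start(), tail_match.end())
            # Use -1 as the offset and keep the raw string when no span exists.
            head_span_start = head_span.start if head_span is not None else -1
            tail_span_start = tail_span.start if tail_span is not None else -1
            if head_span is None:
                head_span = triplet["head"]
            if tail_span is None:
                tail_span = triplet["tail"]
            offset = (head_span_start, tail_span_start)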
            if offset not in doc._.rel:
                doc._.rel[offset] = {
                    "relation": triplet["type"],
                    "head_span": head_span,
                    "tail_span": tail_span,
                }
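
The lookup step of this fix can be tried without spaCy. The sketch below assumes that head and tail should be matched as literal text (hence re.escape, since generated strings may contain characters such as parentheses) and that -1 marks "not found", as in the snippet above. find_offsets is a hypothetical helper, and it returns character offsets, whereas the spaCy code uses token indices.

```python
import re
from typing import Tuple


def find_offsets(text: str, head: str, tail: str) -> Tuple[int, int]:
    """Return the character start offsets of head and tail, or -1 if absent."""

    def start_of(needle: str) -> int:
        # re.escape makes the search literal rather than a regex pattern.
        match = re.search(re.escape(needle), text)
        return match.start() if match is not None else -1

    return start_of(head), start_of(tail)


text = "Barcelona hosted the 1992 Summer Olympics."
print(find_offsets(text, "Barcelona", "1992 Summer Olympics"))
print(find_offsets(text, "2016 Summer Olympics", "Barcelona"))
```

One consequence of the -1 fallback is that two triplets whose head and tail are both missing from the text share the key (-1, -1), so the "if offset not in doc._.rel" check keeps only the first of them.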

Some weird results

I'm testing the library, and I'm getting some weird results:

For the following input:

However, for much of the later part of his career, he worked on two ultimately unsuccessful endeavors. First, despite his great contributions to quantum mechanics, he opposed what it evolved into, objecting that nature "does not play dice". Second, he attempted to devise a unified field theory by generalizing his geometric theory of gravitation to include electromagnetism. As a result, he became increasingly isolated from the mainstream of modern physics.

I get the following results:

{'relation': 'member of', 'head_span': 'John Wayne', 'tail_span': 'The Doors'}
{'relation': 'member of', 'head_span': 'John Wayne', 'tail_span': 'The Rolling Stones'}
{'relation': 'has part', 'head_span': 'The Doors', 'tail_span': 'John Wayne'}
{'relation': 'has part', 'head_span': 'The Doors', 'tail_span': 'John Wayne'}
{'relation': 'has part', 'head_span': 'The Doors', 'tail_span': 'John Wayne'}
{'relation': 'facet of', 'head_span': 'opposed what it evolved into', 'tail_span': 'quantum mechanics'}
{'relation': 'has part', 'head_span': 'field theory', 'tail_span': 'electromagnetism'}
{'relation': 'part of', 'head_span': 'electromagnetism', 'tail_span': 'field theory'}
{'relation': 'different from', 'head_span': 'modern physics', 'tail_span': 'mainstream'}
{'relation': 'different from', 'head_span': 'mainstream', 'tail_span': 'modern physics'}

It seems that the model has issues with pronouns. When I run coreference resolution first, John Wayne and The Doors don't appear :)

when train docred issue!

Hi,
When training on DocRED with one GPU, I get a CUDA out of memory error,
so I tried two GPUs.

I modified the gpus setting in docred_train.yaml as follows:
gpus: [0, 1]

but it doesn't work.

(REBEL) kdk@Z370-Pro4:~/rebel/src$ HYDRA_FULL_ERROR=1 python train.py model=rebel_model data=docred_data train=docred_train
Global seed set to 42
[2022-04-01 14:17:19,346][datasets.builder][WARNING] - Using custom data configuration default-5a4186527af1c5d2
[2022-04-01 14:17:19,346][datasets.builder][WARNING] - Reusing dataset doc_red (/home/kdk/.cache/huggingface/datasets/doc_red/default-5a4186527af1c5d2/0.0.0/2cc6999b276b6aa2b2af5101b416c33155e5f19e6f0b26864a2312d1aa57b175)
/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:50: UserWarning: You requested multiple GPUs but did not specify a backend, e.g. Trainer(accelerator="dp"|"ddp"|"ddp2"). Setting accelerator="ddp_spawn" for you.
warnings.warn(*args, **kwargs)
GPU available: True, used: True
TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
Using native 16bit precision.
[2022-04-01 14:17:20,609][datasets.arrow_dataset][WARNING] - Loading cached processed dataset at /home/kdk/rebel/data/datasets/docred_joint/train_joint.jsondocred_typed.cache
[2022-04-01 14:17:21,774][datasets.arrow_dataset][WARNING] - Loading cached processed dataset at /home/kdk/rebel/data/datasets/docred_joint/dev_joint.jsondocred_typed.cache
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Global seed set to 42
/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:50: UserWarning: MASTER_ADDR environment variable is not defined. Set as localhost
warnings.warn(*args, **kwargs)
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/2
Global seed set to 42
initializing ddp: GLOBAL_RANK: 1, MEMBER: 2/2
Traceback (most recent call last):
File "train.py", line 151, in <module>
main()
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/main.py", line 37, in decorated_main
strict=strict,
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/_internal/utils.py", line 347, in _run_hydra
lambda: hydra.run(
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
raise ex
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
return func()
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/_internal/utils.py", line 350, in <lambda>
overrides=args.overrides,
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 112, in run
configure_logging=with_log_configuration,
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/hydra/core/utils.py", line 127, in run_job
ret.return_value = task_function(task_cfg)
File "train.py", line 147, in main
train(conf)
File "train.py", line 143, in train
trainer.fit(pl_module, datamodule=pl_data_module)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
results = self.accelerator_backend.train()
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_spawn_accelerator.py", line 83, in train
mp.spawn(self.ddp_train, nprocs=self.nprocs, args=(self.mp_queue, model,))
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_spawn_accelerator.py", line 174, in ddp_train
self.trainer.setup_trainer(model)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 442, in setup_trainer
self.logger.log_hyperparams(ref_model.hparams_initial)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py", line 40, in wrapped_fn
return fn(*args, **kwargs)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/loggers/wandb.py", line 170, in log_hyperparams
self.experiment.config.update(params, allow_val_change=True)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/loggers/base.py", line 39, in experiment
return get_experiment() or DummyExperiment()
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py", line 40, in wrapped_fn
return fn(*args, **kwargs)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/loggers/base.py", line 38, in get_experiment
return fn(self)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/pytorch_lightning/loggers/wandb.py", line 152, in experiment
id=self._id, resume='allow', **self._kwargs) if wandb.run is None else wandb.run
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 741, in init
wi.setup(kwargs)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/wandb/sdk/wandb_init.py", line 155, in setup
wandb_login._login(anonymous=anonymous, force=force, _disable_warning=True)
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 210, in _login
wlogin.prompt_api_key()
File "/home/kdk/anaconda3/envs/REBEL/lib/python3.7/site-packages/wandb/sdk/wandb_login.py", line 147, in prompt_api_key
raise UsageError("api_key not configured (no-tty). Run wandb login")
wandb.errors.UsageError: api_key not configured (no-tty). Run wandb login

Do you recognize this issue?
I would like to solve this problem.

thank you.
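
The last line of the trace points away from the GPUs: the spawned DDP worker has no terminal, so Weights & Biases cannot prompt for a key ("api_key not configured (no-tty). Run wandb login"). Running wandb login once beforehand, or exporting WANDB_API_KEY, are the usual ways to supply it. The sketch below is a pre-flight check under stated assumptions: wandb_ready is a hypothetical helper, it looks only at environment variables, and it treats WANDB_MODE=offline as not needing a key, which is worth verifying against the installed wandb version.

```python
import os
from typing import Mapping, Optional


def wandb_ready(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if W&B can likely start without an interactive prompt.

    Hypothetical check based on environment variables only. Credentials
    saved by an earlier `wandb login` are not inspected here.
    """
    env = os.environ if env is None else env
    if env.get("WANDB_MODE", "").lower() == "offline":
        return True
    return bool(env.get("WANDB_API_KEY"))


if not wandb_ready():
    print("No WANDB_API_KEY found; run `wandb login` before multi-GPU training.")
```

Calling a check like this before trainer.fit surfaces the problem in the main process, where the message is easier to read than inside a spawned worker's traceback.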

Comparison with other works?

Hi! Thanks for sharing this awesome work, it's been interesting to explore your approach.

I have a question though.

If I had an approach that assumed NER was solved and only classified the relations between all entities in the input, is it still fair to compare to your results with REBEL (or other models!) for datasets like CONLL-04? With CONLL-04, I noticed all results on paperswithcode perform joint NER and RE. But for a dataset like DocRED, doing the joint task is far less common.

Just looking for some opinions from others in the field, thanks! 👍🏻

KeyError: 'labels' in generate_samples.py; doc_red

Hi,
I am trying to train the model on the DocRED dataset in order to test the effect of labeling the entities with an additional special token.

At the moment I am still trying to get the code to run with the original dataset.

In the first epoch, at 56%, I get KeyError: 'labels' at line 48 of generate_samples.py, in on_train_batch_end: labels = batch.pop("labels")

I checked the dataset for empty labels and found 27 empty arrays in the DocRED data.
Deleting those data points didn't solve the problem.
I also tested using only the first 50% of the dataset.
The error still occurred at 56%.

full console output with print(batch) before the error:

(azureml_py38_PT_TF) azureuser@rebelgpu:/mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/src$ python train.py 
Extension horovod.torch has not been built: /anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/horovod/torch/mpi_lib/_mpi_lib.cpython-38-x86_64-linux-gnu.so not found
If this is not expected, reinstall Horovod with HOROVOD_WITH_PYTORCH=1 to debug the build error.
Warning! MPI libs are missing, but python applications are still available.
Global seed set to 42
Special tokens have been added in the vocabulary, make sure the associated word embedding are fine-tuned or trained.
[2022-08-29 11:47:49,710][datasets.builder][WARNING] - Using custom data configuration default-3b456a334ae5426f
[2022-08-29 11:47:49,710][datasets.builder][WARNING] - Reusing dataset doc_red (/home/azureuser/.cache/huggingface/datasets/doc_red/default-3b456a334ae5426f/0.0.0/2cc6999b276b6aa2b2af5101b416c33155e5f19e6f0b26864a2312d1aa57b175)
GPU available: True, used: True
TPU available: None, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Using native 16bit precision.
[2022-08-29 11:47:50,688][datasets.arrow_dataset][WARNING] - Loading cached processed dataset at /mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/data/doc_red/train_annotated.jsondocred_typed.cache
[2022-08-29 11:47:51,828][datasets.arrow_dataset][WARNING] - Loading cached processed dataset at /mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/data/doc_red/dev.jsondocred_typed.cache
wandb: Currently logged in as: llukas (use `wandb login --relogin` to force relogin)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
wandb: wandb version 0.13.2 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.10.26
wandb: Syncing run bart-large
wandb: ⭐️ View project at https://wandb.ai/llukas/docred_typed
wandb: 🚀 View run at https://wandb.ai/llukas/docred_typed/runs/3b8sf4s1
wandb: Run data is saved locally in /mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/src/outputs/2022-08-29/11-47-34/wandb/run-20220829_114753-3b8sf4s1
wandb: Run `wandb offline` to turn off syncing.


  | Name    | Type                         | Params
---------------------------------------------------------
0 | model   | BartForConditionalGeneration | 406 M 
1 | loss_fn | CrossEntropyLoss             | 0     
---------------------------------------------------------
406 M     Trainable params
0         Non-trainable params
406 M     Total params
/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py:50: UserWarning: The dataloader, val dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  warnings.warn(*args, **kwargs)
Validation sanity check:   0%|                                                                                                                                                                                                              | 0/2 [00:00<?, ?it/s]/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/transformers/generation_utils.py:1777: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  next_indices = next_tokens // vocab_size
Validation sanity check: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00,  5.61s/it]RE Evaluation in *** STRICT *** mode
processed 16 sentences with 233 relations; found: 0 relations; correct: 0.
        ALL      TP: 0; FP: 0;  FN: 231
                (m avg): precision: 0.00;       recall: 0.00;   f1: 0.00 (micro)
                (M avg): precision: 0.00;       recall: 0.00;   f1: 0.00 (Macro)

        head of government:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        country:        TP: 0;  FP: 0;  FN: 64; precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        place of birth:         TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        place of death:         TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        father:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        mother:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        spouse:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        country of citizenship:         TP: 0;  FP: 0;  FN: 10; precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        continent:      TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        instance of:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        head of state:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        capital:        TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        official language:      TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        position held:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        child:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        author:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        member of sports team:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        director:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        screenwriter:   TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        educated at:    TP: 0;  FP: 0;  FN: 5;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        composer:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        member of political party:      TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        employer:       TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        founded by:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        league:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        publisher:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        owned by:       TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        located in the administrative territorial entity:       TP: 0;  FP: 0;  FN: 33; precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        genre:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        operator:       TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        religion:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        contains administrative territorial entity:     TP: 0;  FP: 0;  FN: 27; precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        follows:        TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        followed by:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        headquarters location:  TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        cast member:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        producer:       TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        award received:         TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        creator:        TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        parent taxon:   TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        ethnic group:   TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        performer:      TP: 0;  FP: 0;  FN: 6;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        manufacturer:   TP: 0;  FP: 0;  FN: 14; precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        developer:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        series:         TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        sister city:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        legislative body:       TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        basin country:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        located in or next to body of water:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/utilities/distributed.py:50: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  warnings.warn(*args, **kwargs)
        military branch:        TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        record label:   TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        production company:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        location:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        subclass of:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        subsidiary:     TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        part of:        TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        original language of work:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        platform:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        mouth of the watercourse:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        original network:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        member of:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        chairperson:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        country of origin:      TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        has part:       TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        residence:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        date of birth:  TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        date of death:  TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        inception:      TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        dissolved, abolished or demolished:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        publication date:       TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        start time:     TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        end time:       TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        point in time:  TP: 0;  FP: 0;  FN: 1;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        conflict:       TP: 0;  FP: 0;  FN: 4;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        characters:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        lyrics by:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        located on terrain feature:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        participant:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        influenced by:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        location of formation:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        parent organization:    TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        notable work:   TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        separated from:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        narrative location:     TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        work location:  TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        applies to jurisdiction:        TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        product or material produced:   TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        unemployment rate:      TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        territory claimed by:   TP: 0;  FP: 0;  FN: 3;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        participant of:         TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        replaces:       TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        replaced by:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        capital of:     TP: 0;  FP: 0;  FN: 2;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        languages spoken, written or signed:    TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        present in work:        TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
        sibling:        TP: 0;  FP: 0;  FN: 0;  precision: 0.00;        recall: 0.00;   f1: 0.00;       0
Epoch 0:   1%|█▋                                                                                                                                                                                           | 8/889 [00:02<04:25,  3.32it/s, loss=8.86, v_num=f4s1]/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:131: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
Epoch 0:  56%|████████████████████████████████████████████████████████████████████████████████████████████████████████▉                                                                                  | 499/889 [02:39<02:04,  3.12it/s, loss=5.25, v_num=f4s1]------------------------------------------------
{'attention_mask': tensor([[1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 1, 1, 1]], device='cuda:0'), 'input_ids': tensor([[    0, 29161,  2897,  ...,     1,     1,     1],
        [    0,   133,   494,  ...,     1,     1,     1],
        [    0, 47001,   329,  ...,     1,     1,     1],
        [    0,   113,  1890,  ...,   347,     4,     2]], device='cuda:0'), 'decoder_input_ids': tensor([[    0, 50267,  2897,  ...,     1,     1,     1],
        [    0, 50267,   496,  ...,     1,     1,     1],
        [    0, 50267, 18775,  ...,     1,     1,     1],
        [    0, 50267,  1890,  ..., 13034,  1437,     2]], device='cuda:0')}
<bound method BatchEncoding.keys of {'attention_mask': tensor([[1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 0, 0, 0],
        [1, 1, 1,  ..., 1, 1, 1]], device='cuda:0'), 'input_ids': tensor([[    0, 29161,  2897,  ...,     1,     1,     1],
        [    0,   133,   494,  ...,     1,     1,     1],
        [    0, 47001,   329,  ...,     1,     1,     1],
        [    0,   113,  1890,  ...,   347,     4,     2]], device='cuda:0'), 'decoder_input_ids': tensor([[    0, 50267,  2897,  ...,     1,     1,     1],
        [    0, 50267,   496,  ...,     1,     1,     1],
        [    0, 50267, 18775,  ...,     1,     1,     1],
        [    0, 50267,  1890,  ..., 13034,  1437,     2]], device='cuda:0')}>
Saving latest checkpoint...
Traceback (most recent call last):
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 561, in train
    self.train_loop.run_training_epoch()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 556, in run_training_epoch
    self.on_train_batch_end(epoch_output, batch_end_outputs, batch, batch_idx, dataloader_idx)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 226, in on_train_batch_end
    self.trainer.call_hook('on_train_batch_end', batch_end_outputs, batch, batch_idx, dataloader_idx)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 925, in call_hook
    trainer_hook(*args, **kwargs)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/callback_hook.py", line 147, in on_train_batch_end
    callback.on_train_batch_end(self, self.get_model(), outputs, batch, batch_idx, dataloader_idx)
  File "/mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/src/generate_samples.py", line 48, in on_train_batch_end
    labels = batch.pop("labels")
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/_collections_abc.py", line 795, in pop
    value = self[key]
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 230, in __getitem__
    return self.data[item]
KeyError: 'labels'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 111, in <module>
    main()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/main.py", line 32, in decorated_main
    _run_hydra(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/_internal/utils.py", line 346, in _run_hydra
    run_and_report(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/_internal/utils.py", line 201, in run_and_report
    raise ex
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/_internal/utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 107, in run
    return run_job(
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/hydra/core/utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 107, in main
    train(conf)
  File "train.py", line 103, in train
    trainer.fit(pl_module, datamodule=pl_data_module)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
    results = self.accelerator_backend.train()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
    return self.train_or_test()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in train_or_test
    results = self.trainer.train()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 592, in train
    self.train_loop.on_train_end()
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 156, in on_train_end
    self.check_checkpoint_callback(should_save=True, is_last=True)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/trainer/training_loop.py", line 190, in check_checkpoint_callback
    callback.on_validation_end(self.trainer, model)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 204, in on_validation_end
    self.save_checkpoint(trainer, pl_module)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 239, in save_checkpoint
    self._validate_monitor_key(trainer)
  File "/anaconda/envs/azureml_py38_PT_TF/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py", line 517, in _validate_monitor_key
    raise MisconfigurationException(m)
pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='val_F1_micro') not found in the returned metrics: ['loss']. HINT: Did you call self.log('val_F1_micro', tensor) in the LightningModule?

wandb: Waiting for W&B process to finish, PID 13914
wandb: Program failed with code 1.  Press ctrl-c to abort syncing.
wandb:                                                                                
wandb: Find user logs for this run at: /mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/src/outputs/2022-08-29/11-47-34/wandb/run-20220829_114753-3b8sf4s1/logs/debug.log
wandb: Find internal logs for this run at: /mnt/batch/tasks/shared/LS_root/mounts/clusters/rebelgpu/code/Users/leon.lukas/rebel-main/src/outputs/2022-08-29/11-47-34/wandb/run-20220829_114753-3b8sf4s1/logs/debug-internal.log
wandb: Run summary:
wandb:   lr-AdamW/pg1 0.0
wandb:   lr-AdamW/pg2 0.0
wandb:           loss 5.46171
wandb:          epoch 0
wandb:       _runtime 174
wandb:     _timestamp 1661773847
wandb:          _step 49
wandb: Run history:
wandb:   lr-AdamW/pg1 ▁
wandb:   lr-AdamW/pg2 ▁
wandb:           loss ▁
wandb:          epoch ▁
wandb:       _runtime ▁
wandb:     _timestamp ▁
wandb:          _step ▁
wandb: 
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
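
In the log above, the printed batch contains only `attention_mask`, `input_ids` and `decoder_input_ids`, so `batch.pop("labels")` has nothing to pop. The later `MisconfigurationException` is raised while handling that first error, so it appears to be a consequence rather than a separate problem. Why the key is missing is not clear from the log. As a minimal sketch, assuming the batch behaves like a mutable mapping, a callback could pop the key defensively and skip the batch when it is absent. The helper name `split_labels` is hypothetical.

```python
def split_labels(batch):
    """Pop 'labels' from a batch mapping without raising.

    Returns (labels, had_labels); labels is None when the key is absent.
    """
    labels = batch.pop("labels", None)
    return labels, labels is not None


# Plain dicts stand in for the real batch objects here.
batch_ok = {"input_ids": [[0, 5, 2]], "labels": [[0, 7, 2]]}
batch_missing = {"input_ids": [[0, 5, 2]], "decoder_input_ids": [[0, 7, 2]]}

for batch in (batch_ok, batch_missing):
    labels, had_labels = split_labels(batch)
    if not had_labels:
        # A real callback would return early instead of generating samples.
        print("no labels in this batch, skipping")
    else:
        print("labels:", labels)
```

This only hides the symptom; it would still be worth finding out which step removes `labels` from the batch before the callback runs.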

spacy_component

Hi,
For me, the spaCy integration only worked after the following fix:

$ diff spacy_component.py fix_spacy_component.py
71c71
<       extracted_text = self.triplet_extractor.tokenizer.batch_decode(output_ids[0])
---
>       extracted_text = self.triplet_extractor.tokenizer.batch_decode(output_ids)

(batch_decode expects a list of lists, not a flat list of ids. The buggy version won't raise an error, but only the first token is processed.)
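
The difference can be reproduced without downloading a model. The sketch below uses a hypothetical `ToyTokenizer` with a made-up five-entry vocabulary; it is not the real tokenizer, but it follows the same convention in which `batch_decode` decodes each element of its argument separately.

```python
class ToyTokenizer:
    """Hypothetical stand-in that mimics decode / batch_decode semantics."""

    vocab = {0: "<s>", 1: "<triplet>", 2: "Punta", 3: "Cana", 4: "</s>"}

    def decode(self, ids):
        # Accept a single id or a list of ids.
        if isinstance(ids, int):
            ids = [ids]
        return " ".join(self.vocab[i] for i in ids)

    def batch_decode(self, sequences):
        # One decoded string per element of the input.
        return [self.decode(seq) for seq in sequences]


tok = ToyTokenizer()
output_ids = [[0, 1, 2, 3, 4]]  # a batch holding one sequence

buggy = tok.batch_decode(output_ids[0])  # each id is treated as its own sequence
fixed = tok.batch_decode(output_ids)     # the whole sequence is decoded at once

print(buggy[0])
print(fixed[0])
```

Taking element `[0]` of the buggy result yields only the first token, which matches the behaviour described above.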

Error while executing setup

Looks like the dependencies have drifted some:

ERROR: Cannot install -r requirements.txt (line 4) and -r requirements.txt (line 7) because these package versions have conflicting dependencies.

The conflict is caused by:
    transformers 4.19.2 depends on huggingface-hub<1.0 and >=0.1.0
    datasets 1.3.0 depends on huggingface-hub==0.0.2

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
WARNING: You are using pip version 21.1.1; however, version 22.3.1 is available.
You should consider upgrading via the '/usr/local/opt/python@3.9/bin/python3.9 -m pip install --upgrade pip' command.
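
The two requirements quoted in the error cannot be met by any single huggingface-hub release, which is why pip gives up. The sketch below checks this with plain tuple comparison. The candidate version strings are illustrative only, and the helper names are hypothetical.

```python
def parse(version):
    """Turn '0.1.0' into (0, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def ok_for_transformers(version):
    # transformers 4.19.2: huggingface-hub>=0.1.0,<1.0 (from the error above)
    return parse("0.1.0") <= parse(version) < parse("1.0")


def ok_for_datasets(version):
    # datasets 1.3.0: huggingface-hub==0.0.2 (from the error above)
    return parse(version) == parse("0.0.2")


candidates = ["0.0.2", "0.1.0", "0.4.0", "0.11.1"]  # illustrative versions
compatible = [v for v in candidates
              if ok_for_transformers(v) and ok_for_datasets(v)]
print(compatible)
```

Since the list is empty for every candidate, one of the two packages has to move: either an older transformers release that accepts the hub version datasets 1.3.0 pins, or a newer datasets release. Which combination the training code still works with is something to verify, not something this sketch can tell.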

Can't convert output_ids

Hello,
I am trying to use REBEL to extract relations from text. I followed the instructions in the README file, but I got the error below:
TypeError: Can't convert {'output_ids': [[0, 50267, 221, 20339, 2615, 102, 1437, 50266, 1587, 7330, 1073, 13249, 493, 16517, 1437, 50265, 2034, 11, 5, 6833, 15752, 10014, 1437, 50266, 18978, 3497, 1437, 50265, 247, 1437, 50267, 19664, 1780, 219, 1437, 50266, 1587, 7330, 1073, 13249, 493, 16517, 1437, 50265, 2034, 11, 5, 6833, 15752, 10014, 1437, 50266, 18978, 3497, 1437, 50265, 247, 1437, 50267, 1587, 7330, 1073, 13249, 493, 16517, 1437, 50266, 18978, 3497, 1437, 50265, 247, 1437, 50267, 18978, 3497, 1437, 50266, 1587, 7330, 1073, 13249, 493, 16517, 1437, 50265, 6308, 6833, 15752, 10014, 2]]} to Sequence
This is my code:

from transformers import pipeline

triplet_extractor = pipeline('text2text-generation', model='Babelscape/rebel-large', tokenizer='Babelscape/rebel-large')

# We need to use the tokenizer manually since we need special tokens.
extracted_text = triplet_extractor.tokenizer.batch_decode([triplet_extractor("Punta Cana is a resort town in the municipality of Higuey, in La Altagracia Province, the eastern most province of the Dominican Republic", return_tensors=True, return_text=False)[0]["generated_token_ids"]])

print(extracted_text[0])

# Function to parse the generated text and extract the triplets
def extract_triplets(text):
    triplets = []
    subject, relation, object_ = '', '', ''
    text = text.strip()
    current = 'x'
    for token in text.replace("<s>", "").replace("<pad>", "").replace("</s>", "").split():
        if token == "<triplet>":
            current = 't'
            if relation != '':
                triplets.append({'head': subject.strip(), 'type': relation.strip(),'tail': object_.strip()})
                relation = ''
            subject = ''
        elif token == "<subj>":
            current = 's'
            if relation != '':
                triplets.append({'head': subject.strip(), 'type': relation.strip(),'tail': object_.strip()})
            object_ = ''
        elif token == "<obj>":
            current = 'o'
            relation = ''
        else:
            if current == 't':
                subject += ' ' + token
            elif current == 's':
                object_ += ' ' + token
            elif current == 'o':
                relation += ' ' + token
    if subject != '' and relation != '' and object_ != '':
        triplets.append({'head': subject.strip(), 'type': relation.strip(),'tail': object_.strip()})
    return triplets
extracted_triplets = extract_triplets(extracted_text[0])
print(extracted_triplets)

So how can I solve this, please?
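
The error message suggests that, in this environment, `["generated_token_ids"]` holds a dict of the form `{'output_ids': [[...]]}` rather than a flat list of ids, which `batch_decode` cannot convert. A minimal sketch of a workaround is to unwrap the value before decoding. The helper name `unwrap_token_ids` is hypothetical, and the sketch assumes plain Python lists, as printed in the error.

```python
def unwrap_token_ids(generated):
    """Return a flat list of token ids from either observed return shape.

    Accepts a flat list of ids, a nested list, or a dict with an
    'output_ids' entry; for a batch, only the first sequence is kept.
    """
    if isinstance(generated, dict) and "output_ids" in generated:
        generated = generated["output_ids"]
    # Strip batch nesting until the elements are ids rather than lists.
    while generated and isinstance(generated[0], (list, tuple)):
        generated = generated[0]
    return list(generated)


from_error = {"output_ids": [[0, 50267, 221, 2]]}  # shape seen in the TypeError
from_readme = [0, 50267, 221, 2]                   # shape the README code expects

print(unwrap_token_ids(from_error))
print(unwrap_token_ids(from_readme))
```

The unwrapped list could then be passed as `triplet_extractor.tokenizer.batch_decode([unwrap_token_ids(ids)])`, where `ids` is the value taken from the pipeline output.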

Architecture of REBEL model

Hello,
Thank you so much for your work. Could you please share the model architecture, or is it the same as the standard Transformer architecture?

No improvement with pre-trained REBEL model on CONLL04

Thanks for your work. I tried to train on CONLL04, and it works as expected with BART, but with the REBEL-Large model, all the scores stay at 0.
I know there is an issue with transformers 4.4.0, and my experiments were conducted with pytorch 1.7.1 and transformers 4.12.4 (no change to datasets or any other package).
Have I been doing something wrong? If there is no other solution, is it possible to get an updated version of the model? If not, could I ask for the finetuned CONLL04 model with REBEL pretraining?

Getting conflict between transformers and datasets.

Hi,
I was trying to train your model with the NYT data using the following commands:
pip install -r requirements.txt
python src/train.py data=nyt_data train=nyt_train

I got the following error:
ERROR: Cannot install -r requirements.txt (line 4) and -r requirements.txt (line 7) because these package versions have conflicting dependencies.
The conflict is caused by:
transformers 4.19.2 depends on huggingface-hub<1.0 and >=0.1.0
datasets 1.3.0 depends on huggingface-hub==0.0.2

Can you please help me resolve this issue?
Regards,
Samiran

Pytorch shape mismatch

When I try to run model_saving.py to save the model in a hf transformers format, I get the following error and am not sure how to resolve this. Is there an issue with my training, or is one of my packages incompatible? Thank you for your help!

 python rebel/src/model_saving.py

File "rebel/src/model_saving.py", line 25, in <module>
    model = pl_module.load_from_checkpoint(checkpoint_path = 'rebel/outputs/2022-10-17/08-41-38/experiments/docred/epoch=13-step=1315.ckpt', config = config, tokenizer = tokenizer, model = model)
  File "/anaconda/envs/worker_venv/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 159, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/anaconda/envs/worker_venv/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 205, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/anaconda/envs/worker_venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BasePLModule:
        size mismatch for model.final_logits_bias: copying a param with shape torch.Size([1, 50278]) from checkpoint, the shape in current model is torch.Size([1, 50268]).
        size mismatch for model.model.shared.weight: copying a param with shape torch.Size([50278, 1024]) from checkpoint, the shape in current model is torch.Size([50268, 1024]).
        size mismatch for model.model.encoder.embed_tokens.weight: copying a param with shape torch.Size([50278, 1024]) from checkpoint, the shape in current model is torch.Size([50268, 1024]).
        size mismatch for model.model.decoder.embed_tokens.weight: copying a param with shape torch.Size([50278, 1024]) from checkpoint, the shape in current model is torch.Size([50268, 1024]).
        size mismatch for model.lm_head.weight: copying a param with shape torch.Size([50278, 1024]) from checkpoint, the shape in current model is torch.Size([50268, 1024])

packages installed:
absl-py                 1.3.0
aiohttp                 3.8.3
aiosignal               1.2.0
altair                  4.2.0
antlr4-python3-runtime  4.8
arrow                   1.2.3
astor                   0.8.1
async-timeout           4.0.2
attrs                   22.1.0
backports.zoneinfo      0.2.1
base58                  2.1.1
blinker                 1.5
bravado                 11.0.3
bravado-core            5.17.1
cachetools              5.2.0
certifi                 2022.9.24
charset-normalizer      2.1.1
click                   7.1.2
configparser            5.3.0
datasets                1.3.0
decorator               5.1.1
dill                    0.3.5.1
docker-pycreds          0.4.0
entrypoints             0.4
filelock                3.8.0
fqdn                    1.5.1
frozenlist              1.3.1
fsspec                  2022.8.2
future                  0.18.2
gitdb                   4.0.9
GitPython               3.1.29
google-auth             2.12.0
google-auth-oauthlib    0.4.6
grpcio                  1.49.1
huggingface-hub         0.10.1
hydra-core              1.0.6
idna                    3.4
importlib-metadata      5.0.0
importlib-resources     5.10.0
isoduration             20.11.0
Jinja2                  3.1.2
joblib                  1.2.0
jsonpointer             2.3
jsonref                 0.3.0
jsonschema              4.16.0
Markdown                3.4.1
MarkupSafe              2.1.1
monotonic               1.6
msgpack                 1.0.4
multidict               6.0.2
multiprocess            0.70.13
neptune-client          0.5.1
nltk                    3.7
numpy                   1.23.4
oauthlib                3.2.1
omegaconf               2.0.6
packaging               21.3
pandas                  1.5.0
pathtools               0.1.2
Pillow                  9.2.0
pip                     22.2.2
pkgutil_resolve_name    1.3.10
portalocker             2.5.1
promise                 2.3
protobuf                3.19.6
psutil                  5.8.0
pyarrow                 9.0.0
pyasn1                  0.4.8
pyasn1-modules          0.2.8
pydeck                  0.8.0b4
pyDeprecate             0.3.2
PyJWT                   2.5.0
pyparsing               3.0.9
pyrsistent              0.18.1
python-dateutil         2.8.2
pytorch-lightning       1.1.7
pytz                    2022.4
pytz-deprecation-shim   0.1.0.post0
PyYAML                  6.0
regex                   2022.9.13
requests                2.28.1
requests-oauthlib       1.3.1
rfc3339-validator       0.1.4
rfc3987                 1.3.8
rouge-score             0.0.4
rsa                     4.9
sacrebleu               1.5.0
sentry-sdk              1.9.10
setuptools              63.4.1
shortuuid               1.0.9
simplejson              3.17.6
six                     1.16.0
smmap                   5.0.0
streamlit               0.82.0
subprocess32            3.5.4
swagger-spec-validator  2.7.6
tensorboard             2.10.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.8.1
tokenizers              0.12.1
toml                    0.10.2
toolz                   0.12.0
torch                   1.12.1
torchmetrics            0.10.0
tornado                 6.2
tqdm                    4.64.1
transformers            4.23.1
typing_extensions       4.4.0
tzdata                  2022.5
tzlocal                 4.2
uri-template            1.2.0
urllib3                 1.26.12
validators              0.20.0
wandb                   0.10.26
watchdog                2.1.9
webcolors               1.12
websocket-client        1.4.1
Werkzeug                2.2.2
wheel                   0.37.1
xxhash                  3.0.0
yarl                    1.8.1
zipp                    3.9.0
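
For anyone reading the size-mismatch lines in the traceback above: every one of them differs only on the vocabulary axis (50278 in the checkpoint, 50268 in the model built by model_saving.py). Assuming the base BART-large vocabulary of 50265 entries, that would correspond to 13 added tokens at training time against 3 in the rebuilt model, which hints that the tokenizer in model_saving.py was not given the same extra tokens as during training. That reading is an assumption, not a confirmed diagnosis. The sketch below only parses the error text and reports the two vocabulary sizes; the function names are illustrative.

```python
import re

# Two lines copied from the error message in this issue.
ERROR_TEXT = """
size mismatch for model.final_logits_bias: copying a param with shape torch.Size([1, 50278]) from checkpoint, the shape in current model is torch.Size([1, 50268]).
size mismatch for model.lm_head.weight: copying a param with shape torch.Size([50278, 1024]) from checkpoint, the shape in current model is torch.Size([50268, 1024]).
"""

PATTERN = re.compile(
    r"size mismatch for (?P<name>[\w.]+): .*?torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\)"
    r".*?torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def to_shape(text):
    """Convert '50278, 1024' into (50278, 1024)."""
    return tuple(int(dim) for dim in text.split(","))


def vocab_sizes(error_text):
    """Return {param_name: (checkpoint_size, model_size)} on the differing axis."""
    result = {}
    for match in PATTERN.finditer(error_text):
        ckpt, model = to_shape(match["ckpt"]), to_shape(match["model"])
        # The axis where the shapes disagree is the vocabulary axis.
        axis = next(i for i, (a, b) in enumerate(zip(ckpt, model)) if a != b)
        result[match["name"]] = (ckpt[axis], model[axis])
    return result


if __name__ == "__main__":
    for name, (ckpt_vocab, model_vocab) in vocab_sizes(ERROR_TEXT).items():
        print(f"{name}: checkpoint={ckpt_vocab} model={model_vocab} "
              f"diff={ckpt_vocab - model_vocab}")
```

If the tokenizers do differ, the usual transformers call for matching the embedding matrix to a tokenizer is `model.resize_token_embeddings(len(tokenizer))`, applied before loading the checkpoint; whether that is the right fix here depends on which tokens were added in training.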

How to predict entity types

Thanks for your excellent work.
I read your paper and skimmed the code, but I can't find when and where REBEL predicts entity types.
I see the paper reports Strict Evaluation, so I think I missed something. Can you point me to the corresponding code?

Can this method be applied to Chinese datasets?

The BART-large model is pre-trained on English, and the datasets in the paper are all English. Since Chinese relation extraction datasets are also scarce and often small, I wonder whether this method can be applied to a Chinese dataset. If so, how should it be adapted?

Other languages support

Hey, outstanding paper you got. I mentioned it in my Thesis :)

I was trying to train it on Russian. I generated the dataset with CROCODILE, edited the config files, downloaded facebook/mbart-50-large-50, and added it to the conf files as well. Additionally, I added src_lang and tgt_lang parameters, both set to ru_RU, to the AutoTokenizer call in train.py.

I struggled with numerous errors in rebel_short.py, pl_modules.py and pl_data_modules.py that happened on my setup. I also had to update pytorch-lightning several versions up, to 1.3.0, because the version in requirements.txt had a problem at startup. Maybe it relates to #22.

So, for now I have several questions for you as the creators:

  1. Should there be any problem with replacing bart with mbart-50?
  2. What would you recommend checking if the printed log with all my relations and their TP, FP, FN, precision, recall and F1 contains only zeros and the loss is nan?
  3. In generate_samples.py there is a line

pl_module.logger.experiment.log({"Triplets": wandb_table})

Once everything is set up and training has started, an exception is thrown there at iteration 999, saying that SummaryWriter() has no attribute log. I've checked the pytorch-lightning docs, but they do not really tell me where to look.

I put it in a try/except block, but I worry that this may lead to the 4th question:

  4. At 50% of training it stops with a RuntimeError: 'could not infer dtype of Table'.

I am not sure if stacktraces will help here, they are not really informative.

Would love to get any answer here. Thanks again for the paper.
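
On the third question: the message suggests that `pl_module.logger.experiment` is a TensorBoard SummaryWriter rather than a W&B run, which is what happens when no W&B logger is attached to the Trainer. That is an assumption about this setup. Instead of a blanket try/except, the call could be guarded by checking whether the experiment object has a `log` method. The sketch below uses two stand-in classes so that it runs without wandb or TensorBoard installed; `log_triplets` is a hypothetical helper, not a function from the repository.

```python
class FakeWandbRun:
    """Stand-in for a W&B run object, which exposes .log()."""

    def __init__(self):
        self.logged = []

    def log(self, payload):
        self.logged.append(payload)


class FakeSummaryWriter:
    """Stand-in for TensorBoard's SummaryWriter, which has no .log()."""


def log_triplets(experiment, table):
    """Log the table if the experiment supports .log(); return whether it did."""
    log_fn = getattr(experiment, "log", None)
    if callable(log_fn):
        log_fn({"Triplets": table})
        return True
    # Unsupported logger: skip quietly instead of raising AttributeError.
    return False


if __name__ == "__main__":
    run = FakeWandbRun()
    print(log_triplets(run, "table-object"))              # logged
    print(log_triplets(FakeSummaryWriter(), "table-object"))  # skipped
```

A narrow guard like this skips only the logging call, so unrelated errors in the same callback still surface.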

What's the token for token_id 50265?

In BasePLModule, at lines 217 and 231, a variable stores a boolean matrix built from the conditions labels==50265 and generated_tokens==50265. I would like to know which token has id 50265 in your repo.
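
Some context that may help: the BART-large vocabulary has 50265 entries (ids 0 to 50264), so id 50265 is the first token added on top of it. Which token that is depends on the order in which tokens are added to the tokenizer in the training script, so the reliable check is `tokenizer.convert_ids_to_tokens(50265)` on the tokenizer actually used. The sketch below only illustrates how added-token ids line up after the base vocabulary; the token names in it are placeholders, not the repository's.

```python
BASE_VOCAB_SIZE = 50265  # vocabulary size of BART-large


def assign_added_token_ids(added_tokens, base_vocab_size=BASE_VOCAB_SIZE):
    """Map each added token to the id it receives after the base vocabulary."""
    return {
        token: base_vocab_size + offset
        for offset, token in enumerate(added_tokens)
    }


if __name__ == "__main__":
    # Placeholder names; the real ones come from the training script.
    ids = assign_added_token_ids(["<new_0>", "<new_1>", "<new_2>"])
    for token, token_id in ids.items():
        print(token_id, token)
```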

Fine-Tuning with custom dataset

Hi! I was trying to fine-tune the model with my custom dataset.
I could not find any instructions for this, and modifying the Hydra config file inside the directory did not work. Is there a way to do this?

IndexError: list index out of range

Hello,

I am trying to train a new model on a Dutch dataset. For this I have followed the instructions of the crocodile repository (except the last steps in which relations are filtered since the models do not work for Dutch).
Besides that I have created my own relations_count.tsv file. Further I have created my own data config file.

I keep getting an "IndexError: list index out of range" error. I hope you can help me!

The config file:

`# @package _global_

num_workers: 8
dataset_name: 'C:\Users\kbrekelm\Desktop\python_projects\rebel/datasets/rebel-short.py'
text_column: 'context'
target_column: 'triplets'
train_file: 'C:\Users\kbrekelm\Desktop\python_projects\crocodile\out\nl\rebel_0-5000.jsonl'
validation_file: 'C:\Users\kbrekelm\Desktop\python_projects\crocodile\out\nl/rebel_5000-10000.jsonl'
test_file: 'C:\Users\kbrekelm\Desktop\python_projects\crocodile\out\nl\rebel_10000-15000.jsonl'
overwrite_cache: False
preprocessing_num_workers: 
max_source_length: 256
max_target_length: 128
val_max_target_length: 128
pad_to_max_length: False
max_train_samples:
max_val_samples: 
max_test_samples: 
num_beams: 
eval_beams: 3
ignore_pad_token_for_loss: True
source_prefix: 
relations_file: 'C:\Users\kbrekelm\Desktop\python_projects\rebel\data\relations_count.tsv'
#relations_file: 'C:\Users\kbrekelm\Desktop\python_projects\rebel\data\nlrebel_relations.tsv'`

The command:
python train.py model=default_model data=nlrebel train=nlrebel_train

The error:
Traceback (most recent call last):
  File "train.py", line 111, in <module>
    main()
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\main.py", line 37, in decorated_main
    strict=strict,
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\_internal\utils.py", line 347, in _run_hydra
    lambda: hydra.run(
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\_internal\utils.py", line 201, in run_and_report
    raise ex
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\_internal\utils.py", line 198, in run_and_report
    return func()
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\_internal\utils.py", line 350, in <lambda>
    overrides=args.overrides,
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\_internal\hydra.py", line 112, in run
    configure_logging=with_log_configuration,
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\hydra\core\utils.py", line 127, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 107, in main
    train(conf)
  File "train.py", line 55, in train
    pl_data_module = BasePLDataModule(conf, tokenizer, model)
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\pytorch_lightning\core\datamodule.py", line 49, in __call__
    obj = type.__call__(cls, *args, **kwargs)
  File "C:\Users\kbrekelm\Desktop\python_projects\rebel\src\pl_data_modules.py", line 66, in __init__
    self.datasets = load_dataset(conf.dataset_name, data_files={'train': conf.train_file, 'dev': conf.validation_file, 'test': conf.test_file, 'relations': conf.relations_file})
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\load.py", line 1751, in load_dataset
    use_auth_token=use_auth_token,
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\builder.py", line 705, in download_and_prepare
    dl_manager=dl_manager, verify_infos=verify_infos, **download_and_prepare_kwargs
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\builder.py", line 1227, in _download_and_prepare
    super()._download_and_prepare(dl_manager, verify_infos, check_duplicate_keys=verify_infos)
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\builder.py", line 793, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\builder.py", line 1216, in _prepare_split
    desc=f"Generating {split_info.name} split",
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Users\kbrekelm\.cache\huggingface\modules\datasets_modules\datasets\rebel-short\cc2251b8498d028fbe2d96f927adc4acc8a079f4519818b9d7fb1ef5c693a910\rebel-short.py", line 103, in _generate_examples
    relations_df = pd.read_csv(self.config.data_files['relations'], header = None, sep='\t')
  File "C:\Users\kbrekelm\anaconda3\envs\knowledge\lib\site-packages\datasets\streaming.py", line 67, in wrapper
    return function(*args, use_auth_token=use_auth_token, **kwargs)
  ekelm\anaconda3\envs\knowledge\lib\site-packages\fsspec\core.py", line 213, in __getitem__
    out = super().__getitem__(item)
IndexError: list index out of range
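
The traceback ends inside fsspec while the dataset script reads `data_files['relations']`, so one thing worth ruling out before digging deeper is that any of the four configured paths is missing or empty. This is not a confirmed cause, only a cheap pre-flight check. `check_data_files` is a hypothetical helper that takes the same split-to-path mapping passed to `load_dataset`.

```python
from pathlib import Path


def check_data_files(data_files):
    """Return a list of problems; an empty list means every file exists and is non-empty."""
    problems = []
    for split, raw_path in data_files.items():
        path = Path(raw_path)
        if not path.is_file():
            problems.append(f"{split}: file not found: {raw_path}")
        elif path.stat().st_size == 0:
            problems.append(f"{split}: file is empty: {raw_path}")
    return problems


if __name__ == "__main__":
    # Illustrative paths; substitute the values from the data config.
    report = check_data_files({
        "train": "out/nl/rebel_0-5000.jsonl",
        "relations": "data/relations_count.tsv",
    })
    print("\n".join(report) or "all data files look usable")
```

Running it with the values of train_file, validation_file, test_file and relations_file from the config shows quickly whether the problem is on the file side or in the loading code.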

Correction of the sample codes in README.md

Hello,
There are some mistakes in your code samples in the README.

  1. There is a mistake in the line of the REBEL example code below:
extracted_text = triplet_extractor.tokenizer.batch_decode(triplet_extractor("Punta Cana is a resort town in the municipality of Higuey, in La Altagracia Province, the eastern most province of the Dominican Republic", return_tensors=True, return_text=False)[0]["generated_token_ids"]["output_ids"])

The line above raises TypeError: 'list' object cannot be interpreted as an integer.
It should be:

extracted_text = triplet_extractor.tokenizer.batch_decode(triplet_extractor("Punta Cana is a resort town in the municipality of Higuey, in La Altagracia Province, the eastern most province of the Dominican Republic", return_tensors=True, return_text=False)[0]["generated_token_ids"]["output_ids"][0])
  2. The second mistake is in the spaCy example:
    The line below raises TypeError: add_pipe() got an unexpected keyword argument 'config'
nlp.add_pipe("rebel", after="senter", config={
    'device':0, # Number of the GPU, -1 if want to use CPU
    'model_name':'Babelscape/rebel-large'} # Model used, will default to 'Babelscape/rebel-large' if not given
    )

Could you correct the samples in the README?
Thanks in advance.
Best,
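
A possible explanation for the second error, offered as an assumption: the `config` keyword of `add_pipe` belongs to the spaCy v3 API, so the TypeError would be consistent with an older spaCy being installed. One way to check without guessing versions is to inspect the signature of the method directly. The sketch below uses two stand-in functions with simplified signatures (they are not spaCy's real ones) so that it runs without spaCy installed.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins with simplified signatures, for illustration only.
def add_pipe_without_config(component, name=None, before=None, after=None):
    return component


def add_pipe_with_config(factory_name, name=None, *, after=None, config=None):
    return factory_name


if __name__ == "__main__":
    print(accepts_keyword(add_pipe_without_config, "config"))
    print(accepts_keyword(add_pipe_with_config, "config"))
```

With spaCy installed, `accepts_keyword(nlp.add_pipe, "config")` would tell whether the README snippet can work on that installation.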
