Comments (10)
It works, thanks!
from transformersum.
I believe you have an old version of TransformerSum. Try running git pull
to update to the latest version. This bug is already fixed in abstractive.py
: https://github.com/HHousen/TransformerSum/blob/master/src/abstractive.py#L674.
Thanks! Now it looks like I have another problem with the same line during the validation sanity check:
Traceback (most recent call last):
File "main.py", line 490, in <module>
main(main_args)
File "main.py", line 125, in main
trainer.fit(model)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 553, in fit
self._run(model)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 918, in _run
self._dispatch()
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 986, in _dispatch
self.accelerator.start_training(self)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
self.training_type_plugin.start_training(trainer)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
self._results = trainer.run_stage()
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 996, in run_stage
return self._run_train()
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1031, in _run_train
self._run_sanity_check(self.lightning_module)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1115, in _run_sanity_check
self._evaluation_loop.run()
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 111, in run
self.advance(*args, **kwargs)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 111, in advance
dataloader_iter, self.current_dataloader_idx, dl_max_batches, self.num_dataloaders
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 111, in run
self.advance(*args, **kwargs)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 110, in advance
output = self.evaluation_step(batch, batch_idx, dataloader_idx)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 154, in evaluation_step
output = self.trainer.accelerator.validation_step(step_kwargs)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 211, in validation_step
return self.training_type_plugin.validation_step(*step_kwargs.values())
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 178, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/root/project/test/TransformerSum/src/abstractive.py", line 709, in validation_step
cross_entropy_loss = self._step(batch)
File "/root/project/test/TransformerSum/src/abstractive.py", line 694, in _step
outputs = self.forward(source, target, source_mask, target_mask, labels=labels)
File "/root/project/test/TransformerSum/src/abstractive.py", line 256, in forward
loss = self.calculate_loss(prediction_scores, labels)
File "/root/project/test/TransformerSum/src/abstractive.py", line 674, in calculate_loss
prediction_scores.view(-1, self.model.config.vocab_size), labels.view(-1)
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/project/test/TransformerSum/src/helpers.py", line 282, in forward
return F.kl_div(output, model_prob, reduction="batchmean")
File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/torch/nn/functional.py", line 2622, in kl_div
reduced = torch.kl_div(input, target, reduction_enum, log_target=log_target)
RuntimeError: The size of tensor a (32100) must match the size of tensor b (32128) at non-singleton dimension 1
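The error at the bottom of the traceback can be reproduced in isolation: torch.nn.functional.kl_div requires the input and target tensors to have matching (broadcastable) shapes, so two distributions over different vocabulary sizes fail immediately. A minimal sketch mirroring the 32100 vs. 32128 mismatch (the tensor contents are arbitrary):

```python
import torch
import torch.nn.functional as F

# Two distributions over mismatched vocabulary sizes, mirroring the
# 32100 vs. 32128 mismatch from the traceback above.
output = torch.log_softmax(torch.randn(4, 32100), dim=-1)  # log-probs, vocab 32100
model_prob = torch.softmax(torch.randn(4, 32128), dim=-1)  # probs, vocab 32128

try:
    F.kl_div(output, model_prob, reduction="batchmean")
except RuntimeError as err:
    # The size of tensor a (32100) must match the size of tensor b (32128) ...
    print(err)
```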
I have the same problem.
Hello @JoachimJaafar and @azouiaymen. This issue is difficult to debug since there are many possible causes, but I may have a solution. Try changing that line (TransformerSum/src/abstractive.py, line 674) to this: prediction_scores.view(-1, prediction_scores.size(-1)), labels.view(-1). If this works for you, I'll merge the change into the master branch.
For reference, this line is commonly used:
- https://github.com/huggingface/transformers/blob/424419f54964a5ca68277e700a8264f701968639/examples/legacy/seq2seq/seq2seq_trainer.py#L167
- https://github.com/huggingface/transformers/blob/83e5a10603ca902c266e40fc98a01dd8a9b04ac4/src/transformers/models/t5/modeling_t5.py#L1642
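The difference between the two variants can be seen in isolation: taking the vocabulary dimension from the tensor itself (prediction_scores.size(-1)) always reshapes consistently with the logits, while a config value can disagree with the model's actual output width. A sketch with toy batch/sequence sizes (the sizes are illustrative, not from the project):

```python
import torch

batch, seq_len, logit_vocab = 2, 5, 32128  # actual width of the model's output
config_vocab = 32100                       # stale value, as in the reported error

prediction_scores = torch.randn(batch, seq_len, logit_vocab)
labels = torch.randint(0, logit_vocab, (batch, seq_len))

# Proposed fix: derive the vocab dimension from the tensor itself.
flat_scores = prediction_scores.view(-1, prediction_scores.size(-1))
flat_labels = labels.view(-1)
print(flat_scores.shape, flat_labels.shape)  # torch.Size([10, 32128]) torch.Size([10])

# The old variant would fail here: 2 * 5 * 32128 elements cannot be
# reshaped into rows of width 32100.
# prediction_scores.view(-1, config_vocab)  # -> RuntimeError
```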
I unfortunately still get the same error. What about you, @azouiaymen?
I tried it, but it still doesn't work. Starting to lose faith, @JoachimJaafar.
Hello. It's possible that the issue is with the LabelSmoothingLoss class that I copied from OpenNMT. Can you try setting --label_smoothing 0
in your command to see if that fixes the issue? Thanks.
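The idea behind this workaround is presumably that with no smoothing, the custom LabelSmoothingLoss is bypassed in favor of a plain cross-entropy loss, whose shape handling is standard. A hypothetical sketch of such a fallback (the pad token id and the selection logic are assumptions for illustration, not TransformerSum's exact code):

```python
import torch
import torch.nn as nn

label_smoothing = 0.0  # the value passed via --label_smoothing 0
pad_token_id = 0       # assumption: T5's pad token id

# Hypothetical fallback: with no smoothing, plain CrossEntropyLoss suffices
# and sidesteps the custom LabelSmoothingLoss class entirely.
if label_smoothing == 0:
    loss_fn = nn.CrossEntropyLoss(ignore_index=pad_token_id)

logits = torch.randn(8, 32128)          # (tokens, vocab)
labels = torch.randint(1, 32128, (8,))  # avoid the pad id
loss = loss_fn(logits, labels)
print(loss.item())  # a finite, positive scalar
```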
I was able to run 50 steps with the t5-base model when --label_smoothing was set to 0. For some reason the LabelSmoothingLoss is failing, so TransformerSum needs a more robust implementation.
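One way such a more robust implementation could look, as a sketch: infer the vocabulary size from the logits at call time instead of trusting a size fixed at construction, so the smoothed target distribution always matches the model's actual output width. This is an illustrative rewrite under that assumption, not the project's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RobustLabelSmoothingLoss(nn.Module):
    """Label-smoothing loss that infers the vocab size from the logits
    at call time, so it cannot disagree with the model's output width."""

    def __init__(self, smoothing=0.1, ignore_index=-100):
        super().__init__()
        self.smoothing = smoothing
        self.ignore_index = ignore_index

    def forward(self, log_probs, target):
        # Vocab size taken from the tensor itself, never from a config.
        vocab_size = log_probs.size(-1)
        # Spread the smoothing mass uniformly over non-target tokens.
        model_prob = torch.full_like(log_probs, self.smoothing / (vocab_size - 1))
        safe_target = target.masked_fill(target == self.ignore_index, 0)
        model_prob.scatter_(-1, safe_target.unsqueeze(-1), 1.0 - self.smoothing)
        # Zero out positions that should be ignored (e.g. padding).
        pad_mask = (target == self.ignore_index).unsqueeze(-1)
        model_prob = model_prob.masked_fill(pad_mask, 0.0)
        return F.kl_div(log_probs, model_prob, reduction="batchmean")

# Usage: shapes now always agree, regardless of any stale config value.
loss_fn = RobustLabelSmoothingLoss(smoothing=0.1)
log_probs = torch.log_softmax(torch.randn(4, 32128), dim=-1)
target = torch.randint(0, 32128, (4,))
print(loss_fn(log_probs, target).item())
```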
I can confirm that was indeed the problem. Thanks!