Comments (9)
@foodiehack Can you please copy and paste the error message and link to the specific model you are trying to load? Please note that the arXiv-PubMed links are currently empty, since that model has not been trained yet.
I am trying to run inference using the GUI, as given in the docs for extractive summarisation. The model file used is https://drive.google.com/uc?id=1-W9VzvVgKyu4d3IfNMw0k2zvXzkqpRw7, but I am getting this error:
```
Traceback (most recent call last):
  File "/gdrive/Kruthika1/longsum/transformersum/src/test.py", line 2, in <module>
    model = ExtractiveSummarizer.load_from_checkpoint("/gdrive/Kruthika1/longsum/transformersum/models/epoch=3.ckpt")
  File "/gdrive/Kruthika1/virtualenvironment/Huggingface/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 154, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/gdrive/Kruthika1/virtualenvironment/Huggingface/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 200, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/gdrive/Kruthika1/virtualenvironment/Huggingface/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
    Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".
```
Next: I am also trying to run abstractive summarisation as given in the docs. The model used is from https://drive.google.com/drive/folders/1DBxRZkOHS7OdU80L8OvnzCa3K6-ho_Dj, and, as @foodiehack said, there is no checkpoint; I can only see pytorch_model.bin.
Please suggest a model to run for abstractive summarisation.
@kruthikakr For the extractive error, what version of transformers are you using? Try using v3.0.2 with pip install -U transformers==3.0.2, as discussed at #20 (comment).
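If pinning transformers does not resolve it, here is a minimal workaround sketch (the checkpoint path is a placeholder, and model.predict() is assumed from the repo's docs; strict=False is visible as a supported argument in the traceback above and tells PyTorch Lightning to skip non-matching keys, which is generally safe here since the position_ids buffer is rebuilt at runtime):

```python
# Hedged sketch: load the extractive checkpoint while ignoring the missing
# "position_ids" buffer. Assumes TransformerSum's src/ directory is on the
# path so that ExtractiveSummarizer can be imported.
from extractive import ExtractiveSummarizer

model = ExtractiveSummarizer.load_from_checkpoint(
    "models/epoch=3.ckpt",  # placeholder path to the downloaded checkpoint
    strict=False,           # skip keys missing from the saved state_dict
)
print(model.predict("Text of the document to summarize ..."))
```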
The abstractive summarization folder that you linked contains three different models. The number at the end of each model's name is the maximum input sequence length that it can accept.
@HHousen Yes, the link has 3 folders, and none of them contains a model checkpoint; I can only see the .bin files. Which model can be used to run longformer abstractive summarisation?
@HHousen We have been evaluating different abstractive summarisation models. Since the latest SOTA model is Pegasus, have you looked into it? In our experiments on general text, T5 and BART give better results than Pegasus. Please give me your comments on this.
Adding to this: I see distilled models for extractive summarisation. How can distilled models be made for abstractive summarisation, or is there a reference if they already exist?
> @HHousen Yes, the link has 3 folders, and none of them contains a model checkpoint; I can only see the .bin files. Which model can be used to run longformer abstractive summarisation?
@kruthikakr These are huggingface/transformers models, so they need to be used with the --model_name_or_path option for further training. Or you can load them directly in transformers using LongformerEncoderDecoderForConditionalGeneration.from_pretrained().
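For reference, a minimal loading sketch (an assumption on my part, not from the thread: it presumes the allenai/longformer package that defines this class is installed, and that one of the Drive folders, e.g. longformer-encdec-base-4096, has been downloaded locally; the path is a placeholder):

```python
# Hedged sketch: load a converted BART-to-LongformerEncoderDecoder model.
# The folder must contain the usual huggingface files (config.json,
# pytorch_model.bin) downloaded from the Google Drive link above.
from longformer.longformer_encoder_decoder import (
    LongformerEncoderDecoderForConditionalGeneration,
)
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = LongformerEncoderDecoderForConditionalGeneration.from_pretrained(
    "./longformer-encdec-base-4096"  # placeholder local path
)
```

Note that, as discussed below, these converted models have not yet been fine-tuned for summarization, so their generate() output will not be useful until after fine-tuning.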
> @HHousen We have been evaluating different abstractive summarisation models. Since the latest SOTA model is Pegasus, have you looked into it? In our experiments on general text, T5 and BART give better results than Pegasus. Please give me your comments on this.
> Adding to this: I see distilled models for extractive summarisation. How can distilled models be made for abstractive summarisation, or is there a reference if they already exist?
You need to use a seq2seq architecture for abstractive summarization. I would recommend distilbart, specifically sshleifer/distilbart-cnn-12-6. The performance of each model depends on the dataset you evaluate on and the dataset the model was trained on. For instance, a model trained to summarize news will not summarize a short story well. Other than that, I'm not sure why BART and T5 outperform PEGASUS.
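For example, a quick sketch of trying distilbart through the transformers pipeline API (the input text is a placeholder; note that this model accepts at most 1024 tokens, so it is not suited to long documents):

```python
# Hedged sketch: summarize a short/medium-length text with DistilBART.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
article = "Replace this with the news article or text to summarize."
print(summarizer(article, max_length=142, min_length=10)[0]["summary_text"])
```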
Thank you very much for the reply. I am referring to https://transformersum.readthedocs.io/en/latest/abstractive/models-results.html#bart-converted-to-longformerencdec, where the models have no checkpoint files in your library. I don't want to do any training; I am just trying to use the GUI (python predictions_website.py), as I did with the extractive summarisation models. How do I load the models for abstractive summarisation?
Sorry, I am new to transformers. Can you please provide details for LongformerEncoderDecoderForConditionalGeneration.from_pretrained()?
I will verify. Thank you.
There are currently no pre-trained models that can be used to abstractively summarize long documents. Models listed in the BART Converted to LongformerEncoderDecoder section need to be fine-tuned on a long document summarization dataset, such as arXiv-PubMed, to create a model that can summarize long sequences. The arXiv-PubMed models will be trained as soon as I obtain the resources necessary to train them (2 Tesla V100 GPUs).
I've updated the documentation to reflect this.
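To make the required step concrete, a hypothetical fine-tuning invocation (only --model_name_or_path is confirmed by this thread; the mode flag and all other options are assumptions and should be checked against python main.py --help before use):

```bash
# Hedged sketch, not a verified command: fine-tune a converted
# LongformerEncoderDecoder model with TransformerSum; add dataset and
# trainer flags per `python main.py --help`.
python main.py --mode abstractive --model_name_or_path ./longformer-encdec-base-4096
```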
Further discussion of this issue will be moved to #38.