
hhousen / transformersum

425 stars · 6 watchers · 58 forks · 11.98 MB

Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.

Home Page: https://transformersum.rtfd.io

License: GNU General Public License v3.0

Python 100.00%
automatic-summarization machine-learning extractive-summarization bert roberta distilbert albert summarization transformer-models pytorch-lightning

transformersum's People

Contributors

deepsource-autofix[bot] · hhousen · imgbotapp · joeyism · lhemamou · taiypeo



transformersum's Issues

Installation via Pip

Is it possible to install this via pip?

I am trying to run extractive summarization in a Colab notebook, which doesn't really support conda.

Possible to do sub-sentence level extractive summarization?

After reading the documentation, it looks like the extractive summarization components only score sentences. While this is how the vast majority of extractive summarization papers work, some extractive summarization systems and datasets operate at word-level granularity (my own work, for example, is exclusively word-level extractive summarization).

Is there some way to make TransformerSum work at word-level granularity out of the box? When I trained extractive word-level models, I used a final token classification head. Maybe that could be implemented here alongside the current sentence-scoring heads?
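For reference, a toy sketch (invented here, not TransformerSum code) of what word-level selection with a per-token scoring head boils down to at inference time: score every token, then keep the top-k tokens in document order.

```python
# Toy word-level extractive selection. The scores stand in for the outputs
# of a token classification head; the function name is invented.

def select_tokens(tokens, scores, k):
    """Keep the k highest-scoring tokens, preserving their original order."""
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore document order
    return [tokens[i] for i in keep]

tokens = ["the", "cat", "sat", "on", "the", "mat"]
scores = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7]  # per-token head outputs
print(select_tokens(tokens, scores, 3))  # -> ['cat', 'sat', 'mat']
```

In a real model the scores would come from a linear layer over the encoder's per-token hidden states, mirroring how the existing sentence-scoring heads work over sentence embeddings.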

predictions_website.py raises AttributeError: '_LazyAutoMapping' object has no attribute '_mapping'

After creating the transformersum conda env, I tried running python predictions_website.py and it triggers an error:

Traceback (most recent call last):
  File "predictions_website.py", line 8, in <module>
    from extractive import ExtractiveSummarizer  # noqa: E402
  File "/home/msi1/dev_home/transformersum/src/extractive.py", line 39, in <module>
    MODEL_CLASSES = tuple(m.model_type for m in MODEL_MAPPING)  # + CUSTOM_MODEL_CLASSES
  File "/home/msi1/anaconda3/envs/transformersum/lib/python3.7/site-packages/transformers/models/auto/auto_factory.py", line 528, in __iter__
    return iter(self._mapping.keys())
AttributeError: '_LazyAutoMapping' object has no attribute '_mapping'

AttributeError: can't set attribute when using pre-trained extractive summariser

Hi,

Great work on the repo. I followed the Getting Started page and tried to run the mobilebert-uncased-ext-sum model. Here is a simple code snippet I used:

import os
import sys

sys.path.insert(0, os.path.abspath("./src"))
from extractive import ExtractiveSummarizer

model = ExtractiveSummarizer.load_from_checkpoint("mobilebert-uncased-ext-sum.ckpt")

text_to_summarize = "my long text."
 
model.predict(text_to_summarize) 

However, I get the following traceback:

Traceback (most recent call last):
  File "/Users/blazejmanczak/Desktop/Projects/Artemos/ext_summarization/transformersum/testing_extractive.py", line 8, in <module>
    model = ExtractiveSummarizer.load_from_checkpoint("mobilebert-uncased-ext-sum.ckpt")
  File "/opt/miniconda3/envs/transformersum/lib/python3.9/site-packages/pytorch_lightning/core/saving.py", line 157, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/opt/miniconda3/envs/transformersum/lib/python3.9/site-packages/pytorch_lightning/core/saving.py", line 199, in _load_model_state
    model = cls(**_cls_kwargs)
  File "/Users/blazejmanczak/Desktop/Projects/Artemos/ext_summarization/transformersum/src/extractive.py", line 109, in __init__
    self.hparams = hparams
  File "/opt/miniconda3/envs/transformersum/lib/python3.9/site-packages/torch/nn/modules/module.py", line 995, in __setattr__
    object.__setattr__(self, name, value)
AttributeError: can't set attribute

Any tips on how to solve that?
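For anyone hitting this: the failure comes from newer PyTorch Lightning exposing hparams as a read-only property, so the plain assignment in extractive.py raises AttributeError. A stdlib-only stand-in reproduces the behaviour (FakeLightningModule is invented for illustration; save_hyperparameters is the real Lightning method, and pinning pytorch-lightning to an older release is the other common workaround):

```python
# Stdlib reproduction of "AttributeError: can't set attribute": a property
# with no setter rejects plain assignment, which is what newer Lightning
# does with `hparams`.

class FakeLightningModule:
    def __init__(self):
        self._hparams = {}

    @property
    def hparams(self):              # read-only, like newer Lightning
        return self._hparams

    def save_hyperparameters(self, hparams):
        self._hparams.update(hparams)

m = FakeLightningModule()
try:
    m.hparams = {"lr": 3e-5}        # what `self.hparams = hparams` does -> fails
except AttributeError as e:
    print("AttributeError:", e)

m.save_hyperparameters({"lr": 3e-5})  # the supported pattern
print(m.hparams)
```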

A Chinese solution for TransformerSum-extractive, and I've implemented your work in my project

This is the part of my repo in which I've imported your project: https://github.com/PolarisRisingWar/text_summarization_chinese/tree/master/models/transformersum/extractive
My project aims to provide a general solution for Chinese text data across a number of important and well-known text summarization models and packages, and I've already written code for the extractive part of TransformerSum to work on Chinese text data. If there are any citation or other requirements, please tell me.
I've only written Chinese-language documentation; if needed, I can write an English version.

Cannot load any Pre-Trained Model

I created an extractive summarizer using the "extractive" module and the parser (almost all default values):

import torch

from extractive import ExtractiveSummarizer

summarizer = ExtractiveSummarizer
args = build_parser()
model = summarizer(hparams=args)
checkpoint = torch.load("epoch=3.ckpt", map_location=lambda storage, loc: storage)
model.load_state_dict(checkpoint["state_dict"])

When I try to load the state dict into the model, it raises an error: "RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids"."

Sometimes, along with the missing-keys error, it produces IncompatibleKeys and shape-mismatch errors.

Note: the model checkpoint file has the same model name as the one provided to the parser. For build_parser, I just extracted all the arguments from the main.py file and placed them in a function.

It would be nice if you could document the necessary steps for loading a pretrained model in the official documentation. As of now, I can't find any.
Thanks!
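A stdlib sketch of why strict=False helps with the missing position_ids key (the dicts and the load_state_dict function below are stand-ins, not torch itself): newer transformers versions no longer save the position_ids buffer in checkpoints, and non-strict loading simply skips keys that exist on only one side.

```python
# Stand-in for torch's state-dict loading: strict mode raises on any key
# mismatch, non-strict mode loads what it can and reports the rest.

def load_state_dict(model_state, checkpoint_state, strict=True):
    missing = [k for k in model_state if k not in checkpoint_state]
    unexpected = [k for k in checkpoint_state if k not in model_state]
    if strict and (missing or unexpected):
        raise RuntimeError(f"Missing keys: {missing}, unexpected keys: {unexpected}")
    model_state.update({k: v for k, v in checkpoint_state.items() if k in model_state})
    return missing, unexpected

model_state = {"word_embedding_model.embeddings.position_ids": None, "encoder.weight": 0}
ckpt_state = {"encoder.weight": 1}

try:
    load_state_dict(dict(model_state), ckpt_state)            # strict -> error
except RuntimeError as e:
    print(e)

missing, _ = load_state_dict(model_state, ckpt_state, strict=False)
print(missing)  # the buffer is reported as missing, but loading succeeds
```

With the real objects, the equivalent call is model.load_state_dict(checkpoint["state_dict"], strict=False), which is standard PyTorch and leaves the model's own position_ids buffer untouched.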

can't find saved weights after training

My model finished training, but I can't seem to find the saved weights. I created a directory ./trained_models, but nothing is there. Here is my command:
python src/main.py --model_name_or_path allenai/longformer-base-4096 --model_type longformer --data_path ./datasets/cnn_dm_extractive_compressed_small/ --weights_save_path ./trained_models --do_train --max_epochs 1 --no_use_token_type_ids --max_seq_length 2048 --batch_size 4 --log WARNING

longformer training failing at validation step

@HHousen - I left the model training last night and it failed near the very end of the epoch. It looks like the failure is in validation_step, even though it passed the validation sanity check at the beginning. Here is the full output:

Epoch 0:  96%|█████████▌| 71750/75115 [7:41:46<21:39, 2.59it/s, loss=nan, v_num=rhcx, train_loss_total=nan, train_loss_total_norm_batch=nan, train_loss_avg_seq_sum=nan, train_loss_avg_seq_mean=nan, train_loss_avg=nan]
Traceback (most recent call last):
  File "src/main.py", line 393, in <module>
    main(main_args)
  File "src/main.py", line 97, in main
    trainer.fit(model)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 439, in fit
    results = self.accelerator_backend.train()
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 54, in train
    results = self.train_or_test()
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 66, in train_or_test
    results = self.trainer.train()
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 482, in train
    self.train_loop.run_training_epoch()
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 569, in run_training_epoch
    self.trainer.run_evaluation(test_mode=False)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 567, in run_evaluation
    output = self.evaluation_loop.evaluation_step(test_mode, batch, batch_idx, dataloader_idx)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 171, in evaluation_step
    output = self.trainer.accelerator_backend.validation_step(args)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 76, in validation_step
    output = self.__validation_step(args)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 86, in __validation_step
    output = self.trainer.model.validation_step(*args)
  File "/home/jupyter/TransformerSum/src/extractive.py", line 688, in validation_step
    y_hat.detach().cpu().numpy(), y_true.float().detach().cpu().numpy()  
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/transformers/data/metrics/__init__.py", line 37, in acc_and_f1
    f1 = f1_score(y_true=labels, y_pred=preds)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1047, in f1_score
    zero_division=zero_division)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1175, in fbeta_score
    zero_division=zero_division)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1434, in precision_recall_fscore_support
    pos_label)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1250, in _check_set_wise_labels
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 83, in _check_targets
    type_pred = type_of_target(y_pred)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/utils/multiclass.py", line 287, in type_of_target
    _assert_all_finite(y)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/sklearn/utils/validation.py", line 99, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
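The metric crash is a symptom rather than the cause: the training loss was already NaN (visible in the progress bar), so the model's predictions are NaN by validation time, and sklearn's metric functions reject non-finite input. A hedged guard (invented here, not part of TransformerSum) filters non-finite predictions before handing them to sklearn-style metrics, so the run fails with a clearer message or skips the poisoned values:

```python
# Filter non-finite predictions before metric computation; fail loudly
# if nothing survives, since that means the loss diverged.

import math

def finite_pairs(y_true, y_pred):
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if math.isfinite(p)]
    if not pairs:
        raise ValueError("all predictions are non-finite; loss likely diverged")
    return zip(*pairs)

y_true = [1, 0, 1, 1]
y_pred = [0.9, 0.2, float("nan"), 0.8]
t, p = finite_pairs(y_true, y_pred)
print(list(t), list(p))  # -> [1, 0, 1] [0.9, 0.2, 0.8]
```

The underlying NaN loss itself usually needs a separate fix (lower learning rate, gradient clipping, or avoiding unsafe fp16 losses, as in the next issue).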

BCELoss is not safe with fp16

I'm trying to train a longformer model for extractive summarization. It runs out of memory on a V100 GPU, so I tried to use fp16, as AllenAI suggests. So far, this is the command I'm using:
python src/main.py --model_name_or_path allenai/longformer-base-4096 --model_type longformer --data_path ./datasets/cnn_dm_extractive_small --weights_save_path ./trained_models --do_train --max_steps 50000 --use_logger tensorboard --no_use_token_type_ids --auto_scale_batch_size binsearch --amp_level 02 --precision 16

However, I'm getting the following error:
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast. Many models use a sigmoid layer right before the binary cross entropy layer. In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are safe to autocast.

But in your comments in extractive.py you mention that you chose not to use torch.nn.BCEWithLogitsLoss, so could you please suggest how to run with fp16?

Thank you.
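For context on why the error message recommends the swap: BCEWithLogitsLoss fuses the sigmoid and the cross-entropy using the log-sum-exp trick, so it never takes log() of a saturated sigmoid, which is what makes it safe under fp16 autocast. The stdlib sketch below shows the two formulations agree; in the model itself the change would be replacing the final sigmoid + torch.nn.BCELoss with torch.nn.BCEWithLogitsLoss (standard PyTorch, not a TransformerSum-specific API).

```python
# Naive sigmoid-then-BCE vs the numerically stable fused form.

import math

def bce_after_sigmoid(logit, target):
    p = 1.0 / (1.0 + math.exp(-logit))          # sigmoid
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def bce_with_logits(logit, target):
    # stable identity: max(x, 0) - x*y + log(1 + exp(-|x|))
    return max(logit, 0) - logit * target + math.log1p(math.exp(-abs(logit)))

for logit, target in [(2.0, 1), (-1.5, 0), (0.3, 1)]:
    assert abs(bce_after_sigmoid(logit, target) - bce_with_logits(logit, target)) < 1e-9

# the fused form stays finite even where the naive one would hit log(0)
print(bce_with_logits(50.0, 0))  # ~50.0
```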

Summarizing a long document

I was trying to follow the instructions in https://transformersum.readthedocs.io/en/latest/general/getting-started.html, but they don't make sense. The Drive link contains .bin files, while model = AbstractiveSummarizer.load_from_checkpoint("path/to/ckpt/file") needs a .ckpt file. I tried to use LongformerEncoderDecoderForConditionalGeneration.from_pretrained(), but I can't use the resulting model to create summaries. All I want is to test a pre-trained model on a long document. Can you please guide me on how to do so?

Cannot load a pre-trained model

I know this issue has been closed many times before, but it seems like it doesn't work again. I cloned the repo and installed the conda env as the docs said, which installed transformers=4.8.2. Downgrading it to 3.0.2 (as was suggested a year ago) probably isn't the best thing to do.
With transformers 4.8.2 I get:

RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
	Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids". 

When I do:

model = ExtractiveSummarizer.load_from_checkpoint("../distilroberta-base-ext-sum.ckpt")

I tried in Colab and locally with a Jupyter notebook.

"TypeError: forward() got an unexpected keyword argument 'use_cache'" for AbstractiveSummarizer

Hi,

I'm trying to run the following command (from https://transformersum.readthedocs.io/en/latest/abstractive/training.html; I just added --gpus 0):

python main.py --mode abstractive --model_name_or_path bert-base-uncased --decoder_model_name_or_path bert-base-uncased --cache_file_path data --max_epochs 4 --do_train --do_test --batch_size 4 --weights_save_path model_weights --no_wandb_logger_log_model --accumulate_grad_batches 5 --use_scheduler linear --warmup_steps 8000 --gradient_clip_val 1.0 --custom_checkpoint_every_n 300 --gpus 0

Here is my environment.yml:

name: transformersum
channels:
    - conda-forge
    - pytorch
dependencies:
    - pytorch
    - scikit-learn
    - tensorboard
    - spacy
    - spacy-model-en_core_web_sm
    - sphinx
    - pyarrow
    - pip
    - pip:
      - pytorch_lightning
      - transformers
      - torch_optimizer
      - click==7.0
      - wandb
      - rouge-score
      - packaging
      - datasets
      - gradio
      - tokenizers==0.8.0rc4
variables:
    TOKENIZERS_PARALLELISM: true

When performing the validation sanity check, I get the following exception:

Traceback (most recent call last):
  File "main.py", line 457, in <module>
    main(main_args)
  File "main.py", line 119, in main
    trainer.fit(model)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
    self._run(model)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 756, in _run
    self.dispatch()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 797, in dispatch
    self.accelerator.start_training(self)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
    self._results = trainer.run_stage()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 807, in run_stage
    return self.run_train()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 842, in run_train
    self.run_sanity_check(self.lightning_module)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1107, in run_sanity_check
    self.run_evaluation()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 962, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 174, in evaluation_step
    output = self.trainer.accelerator.validation_step(args)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 226, in validation_step
    return self.training_type_plugin.validation_step(*args)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in validation_step
    return self.lightning_module.validation_step(*args, **kwargs)
  File "/root/project/ia-etat-de-l-art/mlflow/scripts/TransformerSum/src/abstractive.py", line 701, in validation_step
    cross_entropy_loss = self._step(batch)
  File "/root/project/ia-etat-de-l-art/mlflow/scripts/TransformerSum/src/abstractive.py", line 686, in _step
    outputs = self.forward(source, target, source_mask, target_mask, labels=labels)
  File "/root/project/ia-etat-de-l-art/mlflow/scripts/TransformerSum/src/abstractive.py", line 248, in forward
    **kwargs
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/transformers/modeling_encoder_decoder.py", line 276, in forward
    **kwargs_encoder,
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'use_cache'

I'm currently using Python 3.6.13.
Do you have an idea of what I'm doing wrong or what I need to change in any of my files? Thanks!
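One hedged mitigation: the environment above leaves transformers unpinned while pinning the 3.x-era tokenizers==0.8.0rc4, and this "unexpected keyword argument 'use_cache'" error is characteristic of a version mismatch between the installed transformers and the EncoderDecoderModel API the code expects. Pinning transformers to the release the repo was tested against may resolve it; the exact version below is a guess based on the tokenizers pin and the downgrade mentioned in older issues, so adjust as needed:

```yaml
    - pip:
      - pytorch_lightning
      - transformers==3.0.2     # guess: matches the tokenizers 0.8.x era; verify against the repo's requirements
      - tokenizers==0.8.0rc4
```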

After extractive training, a process on one GPU won't terminate automatically.

I've found that this process was launched by the command: python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=24, pipe_handle=317) --multiprocessing-fork
The extractive training process is over, and I got the checkpoint whose name contains tmp_end. But this strange process still occupies one of my GPUs and keeps running without any output.
I have to kill it manually. What could be causing this?

AttributeError: [MODEL_CONFIG] object has no attribute 'encoder'

I'm trying to run this command: python main.py --mode abstractive --model_name_or_path t5-base --cache_file_path data --max_epochs 4 --do_train --do_test --batch_size 1 --weights_save_path model_weights --no_wandb_logger_log_model --accumulate_grad_batches 5 --use_scheduler linear --warmup_steps 8000 --gradient_clip_val 1.0 --custom_checkpoint_every_n 10000 --gpus 1 --dataset ../data_chunk.*

My env:

name: transformersum
channels:
    - conda-forge
    - pytorch
dependencies:
    - pytorch
    - scikit-learn
    - tensorboard
    - spacy
    - spacy-model-en_core_web_sm
    - sphinx
    - pyarrow=4
    - pip
    - pip:
      - pytorch_lightning
      - transformers==4.8.0
      - torch_optimizer
      - click==7.0
      - wandb
      - rouge-score
      - packaging
      - datasets
      - gdown
      - gradio
      - torch==1.8.1+cu111
      - torchvision==0.9.1+cu111
      - torchaudio==0.8.1
      - -f https://download.pytorch.org/whl/torch_stable.html
variables:
    TOKENIZERS_PARALLELISM: true
    LC_ALL: C.UTF-8
    LANG: C.UTF-8

Whenever I run the command with a --model_name_or_path other than bert-base-uncased (for instance t5-base), I get this error:

  File "main.py", line 457, in <module>
    main(main_args)
  File "main.py", line 119, in main
    trainer.fit(model)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 553, in fit
    self._run(model)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 918, in _run
    self._dispatch()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 986, in _dispatch
    self.accelerator.start_training(self)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
    self._results = trainer.run_stage()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 996, in run_stage
    return self._run_train()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1031, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1115, in _run_sanity_check
    self._evaluation_loop.run()
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 111, in run
    self.advance(*args, **kwargs)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 111, in advance
    dataloader_iter, self.current_dataloader_idx, dl_max_batches, self.num_dataloaders
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 111, in run
    self.advance(*args, **kwargs)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 110, in advance
    output = self.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 154, in evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 211, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 178, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/root/project/test/TransformerSum/src/abstractive.py", line 704, in validation_step
    cross_entropy_loss = self._step(batch)
  File "/root/project/test/TransformerSum/src/abstractive.py", line 689, in _step
    outputs = self.forward(source, target, source_mask, target_mask, labels=labels)
  File "/root/project/test/TransformerSum/src/abstractive.py", line 254, in forward
    loss = self.calculate_loss(prediction_scores, labels)
  File "/root/project/test/TransformerSum/src/abstractive.py", line 669, in calculate_loss
    prediction_scores.view(-1, self.model.config.encoder.vocab_size), labels.view(-1)
AttributeError: 'T5Config' object has no attribute 'encoder'
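A hedged sketch of a possible fallback for calculate_loss (the vocab_size helper is invented here, and the SimpleNamespace objects stand in for transformers configs): EncoderDecoder configs nest the vocabulary size under config.encoder, while single-model configs like T5Config keep it at the top level, so getattr with a default covers both cases.

```python
# Resolve vocab_size for both nested (EncoderDecoder) and flat (T5-style)
# config layouts.

from types import SimpleNamespace

def vocab_size(config):
    # fall back to the config itself when there is no `encoder` sub-config
    return getattr(config, "encoder", config).vocab_size

enc_dec = SimpleNamespace(encoder=SimpleNamespace(vocab_size=30522))  # bert-like
t5_like = SimpleNamespace(vocab_size=32128)                           # t5-like

print(vocab_size(enc_dec), vocab_size(t5_like))  # -> 30522 32128
```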

Import Abstractive is ambiguous

Which module does from abstractive import AbstractiveSummarizer refer to? I'd like to follow along with the tutorial, thanks!

Some versioning problems when installing the environment

Thanks for this project @HHousen and the great docs!

I've been playing around with Getting Started and encountered a couple of errors, one of which was addressed here (the strict=False fix).

Something I haven't found anywhere yet is the following stack trace, which, after digging through the PyTorch Lightning (PL) releases, appears to be caused by a refactor in version 1.7.0.

Traceback (most recent call last):
  File "/.../TransformerSum/predictions_website.py", line 8, in <module>
    from extractive import ExtractiveSummarizer  # noqa: E402
  File "/.../TransformerSum/src/extractive.py", line 27, in <module>
    from data import FSDataset, FSIterableDataset, SentencesProcessor, pad_batch_collate
  File "/.../TransformerSum/src/data.py", line 14, in <module>
    from helpers import pad
  File "/.../TransformerSum/src/helpers.py", line 51, in <module>
    class StepCheckpointCallback(pl.callbacks.base.Callback):
AttributeError: module 'pytorch_lightning.callbacks' has no attribute 'base'

Since this is a breaking change in PL and TransformerSum is somewhat old, I feel the best fix is just to downgrade pytorch-lightning as follows:

pip install pytorch-lightning==1.6.5

I also needed to specify python=3.10 in the environment definition, since PL doesn't work under 3.11 yet. Perhaps other users would be helped by specifying these versions in the environment.yml file?
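The two version constraints mentioned above could be captured directly in environment.yml; a minimal sketch (the pins mirror this report: PL 1.6.5 predates the 1.7.0 callbacks.base refactor, and PL does not yet support Python 3.11):

```yaml
name: transformersum
dependencies:
    - python=3.10                    # PL 1.6.x does not run on 3.11
    - pip
    - pip:
        - pytorch-lightning==1.6.5   # last release before the callbacks.base refactor in 1.7.0
```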

unable to run pretrained model

I'm facing this issue with every pre-trained model.

[2020-10-05 22:36:49,902] ERROR in app: Exception on /api/predict/ [POST]
Traceback (most recent call last):
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/gradio/networking.py", line 109, in predict
    prediction, durations = app.interface.process(raw_input)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/gradio/interface.py", line 254, in process
    predictions, durations = self.run_prediction(processed_input, return_duration=True)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/gradio/interface.py", line 216, in run_prediction
    prediction = predict_fn(*processed_input)
  File "predictions_website.py", line 11, in summarize_text
    summarizer = ExtractiveSummarizer.load_from_checkpoint(model_choice)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, *args, strict=strict, **kwargs)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 192, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/media/amiya/hdd/miniconda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1044, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
	Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".

Support Positional Encoding?

Does this project support positional encoding, such as the sine and cosine functions from the original 'Attention Is All You Need' paper?

error when training with multiple GPUs : AttributeError: Can't pickle local object 'ExtractiveSummarizer.prepare_data.<locals>.longformer_modifier'

Hi @HHousen -- we have talked in a previous issue -- the good news is that I actually got the longformer training working! But now I'm trying to speed up training by using multiple GPUs. However, I get the following error with multiple GPUs, while it works fine with just 1 GPU:
Traceback (most recent call last):
  File "src/main.py", line 393, in <module>
    main(main_args)
  File "src/main.py", line 97, in main
    trainer.fit(model)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 439, in fit
    results = self.accelerator_backend.train()
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/pytorch_lightning/accelerators/ddp_spawn_accelerator.py", line 65, in train
    mp.spawn(self.ddp_train, nprocs=self.nprocs, args=(self.mp_queue, model,))
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 149, in start_processes
    process.start()
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/opt/conda/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/conda/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/conda/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/opt/conda/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/conda/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'ExtractiveSummarizer.prepare_data.<locals>.longformer_modifier'
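A stdlib demonstration of the root cause (function names invented): the DDP spawn accelerator pickles the model to start worker processes, and a function defined inside a method, like longformer_modifier inside prepare_data, has no importable qualified name, so pickling fails. Hoisting the function to module level (or wrapping a module-level function with functools.partial) is the usual fix.

```python
# Module-level functions pickle by reference; functions defined inside
# another function do not.

import pickle

def module_level_modifier(batch):   # picklable: reachable by qualified name
    return batch

def make_local_modifier():
    def local_modifier(batch):      # not picklable: defined in a local scope
        return batch
    return local_modifier

pickle.dumps(module_level_modifier)  # fine

try:
    pickle.dumps(make_local_modifier())
except (pickle.PicklingError, AttributeError) as e:
    print(type(e).__name__, e)
```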

Training on multi-gpus

Hi,

Thank you for your great work.
I'm using the training script in the documentation as below:

python main.py \
--model_name_or_path bert-base-uncased \
--model_type bert \
--data_path ./bert-base-uncased \
--max_epochs 3 \
--accumulate_grad_batches 2 \
--warmup_steps 2300 \
--gradient_clip_val 1.0 \
--optimizer_type adamw \
--use_scheduler linear \
--do_train --do_test \
--batch_size 16

It works well when using a single GPU. However, when I use multiple GPUs (export CUDA_VISIBLE_DEVICES=0,1), the error below occurs.

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 172, in new_process
    results = trainer.run_stage()
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 807, in run_stage
    return self.run_train()
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 842, in run_train
    self.run_sanity_check(self.lightning_module)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1107, in run_sanity_check
    self.run_evaluation()
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 988, in run_evaluation
    self.evaluation_loop.evaluation_epoch_end(outputs)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 213, in evaluation_epoch_end
    model.validation_epoch_end(outputs)
  File "/home/bering/git/TransformerSum/src/extractive.py", line 841, in validation_epoch_end
    self.log(name, value, prog_bar=True, sync_dist=True)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/core/lightning.py", line 345, in log
    self.device,
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/core/step_result.py", line 116, in log
    value = sync_fn(value, group=sync_dist_group, reduce_op=sync_dist_op)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 301, in reduce
    tensor = sync_ddp_if_available(tensor, group, reduce_op=(reduce_op or "mean"))
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/utilities/distributed.py", line 137, in sync_ddp_if_available
    return sync_ddp(result, group=group, reduce_op=reduce_op)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/pytorch_lightning/utilities/distributed.py", line 170, in sync_ddp
    torch.distributed.all_reduce(result, op=op, group=group, async_op=False)
  File "/home/bering/anaconda3/envs/torch1.6/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 936, in all_reduce
    work = _default_pg.allreduce([tensor], opts)
RuntimeError: Tensors must be CUDA and dense

Can I use the training script in a multi-GPU setting?
Thank you.
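The traceback ends in `all_reduce` with "Tensors must be CUDA and dense": `self.log(..., sync_dist=True)` under DDP hands the logged value to NCCL, which only accepts dense tensors on the process's GPU, while the validation metrics here are computed on the CPU. A minimal sketch of the kind of conversion that avoids this (the helper name `to_sync_tensor` is hypothetical, not part of the library):

```python
import torch

def to_sync_tensor(value, device):
    # NCCL all_reduce (used by sync_dist=True under DDP) requires a dense
    # tensor on the process's device; convert plain Python numbers and
    # CPU tensors before handing them to self.log().
    if not torch.is_tensor(value):
        value = torch.tensor(value, dtype=torch.float32)
    return value.to(device)

# Inside validation_epoch_end one might then write:
# self.log(name, to_sync_tensor(value, self.device), prog_bar=True, sync_dist=True)
```
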

how to load and infer with trained model?

I have trained an extractive Longformer model and have 3 epoch checkpoints saved. Now I want to run inference with it, but I get errors when I run the main.py script with --do_test:

python src/main.py --model_name_or_path /home/jupyter/TransformerSum/trained_models/epoch=2.ckpt --model_type longformer --data_path ./datasets/cnn_dm_extractive_compressed_5000/ --weights_save_path ./trained_models --do_test --no_use_token_type_ids --max_seq_length 2048 --batch_size 4 --log WARNING --use_logger tensorboard --weights_save_path /home/jupyter/TransformerSum/trained_models --use_custom_checkpoint_callback

output:

  File "src/main.py", line 397, in <module>
    main(main_args)
  File "src/main.py", line 56, in main
    model = summarizer(hparams=args)
  File "/home/jupyter/TransformerSum/src/extractive.py", line 112, in __init__
    gradient_checkpointing=hparams.gradient_checkpointing,
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/transformers/configuration_auto.py", line 203, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/transformers/configuration_utils.py", line 243, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/jupyter/TransformerSum/env/lib/python3.7/site-packages/transformers/configuration_utils.py", line 325, in _dict_from_json_file
    text = reader.read()
  File "/opt/conda/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Support `num_workers` for the extractive `DataLoader` (use `Dataset` instead of `IterableDataset`)

The --dataloader_num_workers argument is only supported for abstractive summarization. The reason you cannot change this option for extractive summarization is that the extractive DataLoaders are created from torch.utils.data.IterableDatasets. IterableDatasets replicate the same dataset object in each worker process, so the replicas must be configured differently to avoid duplicated data. See the PyTorch documentation's description of iterable-style datasets and the IterableDataset docstring for more information.

The docstring gives two examples of how to split an IterableDataset workload across all workers. However, I have not implemented this into the library. Ideally, I would simply use a normal Dataset but I'm not certain how to use this properly since the entire dataset cannot be loaded into memory at once. I potentially could use Apache Arrow.

I was looking at how the huggingface/transformers seq2seq example deals with this problem. They use the Dataset class instead of IterableDataset by using the built-in python module linecache, which I have not heard of before. Implementing this will be a significant refactoring of the library's extractive data loading code.
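The linecache approach can be sketched as follows (class name and file format are hypothetical; torch's DataLoader only needs `__len__`/`__getitem__`, so the sketch avoids the torch import):

```python
import json
import linecache

class JsonLinesDataset:
    """Map-style dataset reading one JSON record per line on demand.

    Items are fetched lazily by index, so the whole file never has to fit
    in memory and each DataLoader worker can read independently -- meaning
    num_workers > 0 works without IterableDataset's worker-splitting logic.
    (In practice this would subclass torch.utils.data.Dataset.)
    """

    def __init__(self, path, num_lines):
        self.path = path
        self.num_lines = num_lines

    def __len__(self):
        return self.num_lines

    def __getitem__(self, idx):
        # linecache caches file contents after the first read; lines are 1-indexed.
        return json.loads(linecache.getline(self.path, idx + 1))
```
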

'--data_type' is not accepted when running main.py (extractive mode)

Hi! I am trying to run training for an extractive model, yet main.py keeps throwing this error: "extractive|ERROR> Data is going to be processed, but you have not specified an output format. Set --data_type to the desired format."
I am passing --data_type='txt' as an argument.
The data I'm using is a custom dataset pre-processed with the convert_to_extractive.py script; it is in .json files, as the script outputs.
Thanks in advance! :)

Abstractive summarization model example not working

Hello,

https://transformersum.readthedocs.io/en/latest/abstractive/training.html#example does not seem to work.

We get the following error after installing the environment with conda env create --file environment.yml:

Traceback (most recent call last):
  File "/home/ec2-user/transformersum/src/main.py", line 434, in <module>
    main(main_args)
  File "/home/ec2-user/transformersum/src/main.py", line 74, in main
    model = summarizer(hparams=args)
  File "/home/ec2-user/transformersum/src/abstractive.py", line 123, in __init__
    self.tokenizer.add_tokens(self.rouge_sentence_split_token)
  File "/home/ec2-user/anaconda3/envs/transformersum/lib/python3.9/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'AbstractiveSummarizer' object has no attribute 'tokenizer'

Wondering if this is related to package version inconsistencies or something else? Cheers.

TransformerSum - Tutorial example is not running?

I'm getting this error after running:
from extractive import ExtractiveSummarizer

ImportError                               Traceback (most recent call last)
<ipython-input> in <module>
----> 1 from extractive import ExtractiveSummarizer
      2 #!python predictions_website.py

11 frames
/usr/local/lib/python3.7/dist-packages/torchtext/vocab.py in <module>
     11 from typing import Dict, List, Optional, Iterable
     12 from collections import Counter, OrderedDict
---> 13 from torchtext._torchtext import (
     14     Vocab as VocabPybind,
     15 )

ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZN2at6detail10noopDeleteEPv



Why tokenize twice?

I'm trying to adapt TransformerSum to a non-English custom dataset and am currently very confused by this code in extractive.py:

if tokenized:
    src_txt = [
        " ".join([token.text for token in sentence if str(token) != "."]) + "."
        for sentence in input_sentences
    ]
else:
    nlp = English()
    sentencizer = nlp.create_pipe("sentencizer")
    nlp.add_pipe(sentencizer)
    src_txt = [
        " ".join([token.text for token in nlp(sentence) if str(token) != "."])
        + "."
        for sentence in input_sentences
    ]

  • Why separate the words with spaces, when the resulting string is then tokenized using the tokenizer from the transformers library? I assume those tokenizers are not usually trained on pre-tokenized text, and neither are the pretrained models?
  • Why remove the space before "." characters, but not anywhere else?

Thanks for any explanations.
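To make the questions concrete, the join logic above behaves like this minimal reproduction (the helper name `join_tokens` is mine, not the library's):

```python
def join_tokens(tokens):
    # Drop "." tokens, join everything else with single spaces, then
    # append one trailing "." -- exactly what the list comprehension does.
    return " ".join(t for t in tokens if t != ".") + "."

# Note the space left before the comma: only "." gets special handling,
# every other punctuation token is space-separated like a word.
example = join_tokens(["Hello", ",", "world", "."])  # "Hello , world."
```
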

Found keys that are in the model state dict but not in the checkpoint

I'm glad to find such a good project.
I tried running ExtractiveSummarizer.load_from_checkpoint(my_model_path, strict=False) and it triggers this warning:
UserWarning: Found keys that are in the model state dict but not in the checkpoint: ['word_embedding_model.embeddings.position_ids']
rank_zero_warn(
I searched the previous issues for a solution, but the warning persists. Thanks!

Cannot train the longformer-base-4096 in CNN/DM dataset

Hi,

I created the environment by

conda env create --file environment.yml

My environment.yml:

name: transformersum
channels:
- conda-forge
- pytorch
dependencies:
- python==3.8.6
- pytorch
- scikit-learn
- tensorboard
- spacy
- sphinx
- pip
- pip:
- pytorch_lightning
- transformers==3.0.2
- torch_optimizer
- wandb
- rouge-score
- packaging
- datasets
- gradio

Then I downloaded the CNN/DM dataset for the longformer-base-4096 from
https://drive.google.com/uc?id=1438kLkTC9zc9otkA7Q7sJqDdCxBrfWqj

Next, I ran convert_extractive_pt_to_txt.py in the scripts folder to produce the CNN/DM dataset (.txt).

Finally, I trained the Longformer model on my 3090 GPU with

python main.py --data_path ../datasets/cnn_dailymail_processor --model_name_or_path allenai/longformer-base-4096 --model_type longformer --weights_save_path ../trained_models --do_train --max_steps 5000

and get an error:

Validation sanity check: 0it [00:00, ?it/s]2020-12-14 19:32:07,863|transformers.modeling_longformer|INFO> Input ids are automatically padded from 1315 to 1536 to be a multiple of config.attention_window: 512
/opt/conda/conda-bld/pytorch_1607370172916/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [162,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.
[... the same "Assertion srcIndex < srcSelectDimSize failed" message repeated for threads [1,0,0] through [31,0,0] of blocks [162,0,0] and [174,0,0] ...]
Traceback (most recent call last):
File "main.py", line 408, in <module>
main(main_args)
File "main.py", line 95, in main
trainer.fit(model)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 470, in fit
results = self.accelerator_backend.train()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 66, in train
results = self.train_or_test()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 65, in train_or_test
results = self.trainer.train()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 492, in train
self.run_sanity_check(self.get_model())
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 690, in run_sanity_check
_, eval_results = self.run_evaluation(test_mode=False, max_batches=self.num_sanity_val_batches)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 606, in run_evaluation
output = self.evaluation_loop.evaluation_step(test_mode, batch, batch_idx, dataloader_idx)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 178, in evaluation_step
output = self.trainer.accelerator_backend.validation_step(args)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 84, in validation_step
return self._step(self.trainer.model.validation_step, args)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 76, in _step
output = model_step(*args)
File "/home/admin02/code/sss/longformer/transformersum/src/extractive.py", line 762, in validation_step
outputs, mask = self.forward(**batch)
File "/home/admin02/code/sss/longformer/transformersum/src/extractive.py", line 284, in forward
outputs = self.word_embedding_model(**inputs, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 1000, in forward
encoder_outputs = self.encoder(
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 695, in forward
layer_outputs = layer_module(hidden_states, attention_mask, output_attentions,)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 658, in forward
self_attn_outputs = self.attention(hidden_states, attention_mask, output_attentions=output_attentions,)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 642, in forward
self_outputs = self.self(hidden_states, attention_mask, output_attentions,)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 299, in forward
is_global_attn = any(is_index_global_attn.flatten())
RuntimeError: CUDA error: device-side assert triggered

Then I used the CPU and got another error:

Validation sanity check: 0it [00:00, ?it/s]2020-12-14 19:32:52,152|transformers.modeling_longformer|INFO> Input ids are automatically padded from 1315 to 1536 to be a multiple of config.attention_window: 512
Traceback (most recent call last):
File "main.py", line 408, in <module>
main(main_args)
File "main.py", line 95, in main
trainer.fit(model)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 470, in fit
results = self.accelerator_backend.train()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/cpu_accelerator.py", line 61, in train
results = self.train_or_test()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 65, in train_or_test
results = self.trainer.train()
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 492, in train
self.run_sanity_check(self.get_model())
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 690, in run_sanity_check
_, eval_results = self.run_evaluation(test_mode=False, max_batches=self.num_sanity_val_batches)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 606, in run_evaluation
output = self.evaluation_loop.evaluation_step(test_mode, batch, batch_idx, dataloader_idx)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 178, in evaluation_step
output = self.trainer.accelerator_backend.validation_step(args)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/accelerators/cpu_accelerator.py", line 77, in validation_step
output = self.trainer.model.validation_step(*args)
File "/home/admin02/code/sss/longformer/transformersum/src/extractive.py", line 762, in validation_step
outputs, mask = self.forward(**batch)
File "/home/admin02/code/sss/longformer/transformersum/src/extractive.py", line 284, in forward
outputs = self.word_embedding_model(**inputs, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_longformer.py", line 996, in forward
embedding_output = self.embeddings(
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_roberta.py", line 67, in forward
return super().forward(
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/transformers/modeling_bert.py", line 180, in forward
token_type_embeddings = self.token_type_embeddings(token_type_ids)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 124, in forward
return F.embedding(
File "/home/admin02/.conda/envs/transformersum/lib/python3.8/site-packages/torch/nn/functional.py", line 1852, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

My computing environment:
GPU:3090
nvcc -V:11.1
torch:1.7.1
Python:3.8.6
cudatoolkit:11.0.3

Did I set up the running environment incorrectly, or is something else wrong? I'm not sure.

Thank you in advance.
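Both failures point at an embedding lookup receiving out-of-range ids, and the CPU traceback narrows it to `token_type_embeddings`. Longformer's token type embedding table has a single row, so any non-zero token_type_id indexes out of range; the `--no_use_token_type_ids` flag mentioned elsewhere in these issues exists for exactly this situation. A minimal reproduction of the CPU IndexError (sizes are illustrative):

```python
import torch

# Longformer-style token type table: only one embedding row (id 0 is valid).
token_type_embeddings = torch.nn.Embedding(1, 4)

ok = token_type_embeddings(torch.tensor([0, 0]))  # shape (2, 4), works
try:
    token_type_embeddings(torch.tensor([0, 1]))   # id 1 is out of range
    raised = False
except IndexError:  # "index out of range in self", as in the traceback above
    raised = True
```
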

Python environment inconsistencies?

I used Python 3.8 in a conda env created from your environment.yml, but this does not seem to work; I got the following error message. Do you only support a specific Python version, or is the env corrupted? It seems somewhat unrelated to TransformerSum. What do you think? Thanks in advance!

$ python predictions_website.py
Traceback (most recent call last):
  File "predictions_website.py", line 7, in <module>
    from extractive import ExtractiveSummarizer
  File "/opt/transformersum/src/extractive.py", line 12, in <module>
    import pytorch_lightning as pl
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 59, in <module>
    from pytorch_lightning.trainer import Trainer
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/__init__.py", line 18, in <module>
    from pytorch_lightning.trainer.trainer import Trainer
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 32, in <module>
    from pytorch_lightning.loggers import LightningLoggerBase
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/loggers/__init__.py", line 59, in <module>
    from pytorch_lightning.loggers.wandb import WandbLogger
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/pytorch_lightning/loggers/wandb.py", line 26, in <module>
    import wandb
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/__init__.py", line 37, in <module>
    from wandb import sdk as wandb_sdk
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/sdk/__init__.py", line 12, in <module>
    from .wandb_init import init  # noqa: F401
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 28, in <module>
    from .backend.backend import Backend
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/sdk/backend/backend.py", line 15, in <module>
    from ..internal.internal import wandb_internal
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/sdk/internal/internal.py", line 32, in <module>
    from . import sender
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/sdk/internal/sender.py", line 17, in <module>
    from wandb.filesync.dir_watcher import DirWatcher
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/wandb/filesync/dir_watcher.py", line 7, in <module>
    from watchdog.observers.polling import PollingObserver
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/watchdog/observers/__init__.py", line 63, in <module>
    from .inotify import InotifyObserver as Observer
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/watchdog/observers/inotify.py", line 74, in <module>
    from .inotify_buffer import InotifyBuffer
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/watchdog/observers/inotify_buffer.py", line 20, in <module>
    from watchdog.observers.inotify_c import Inotify
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 63, in <module>
    libc = _load_libc()
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 43, in _load_libc
    return ctypes.CDLL(libc_path)
  File "/opt/anaconda3/envs/transformersum/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /opt/anaconda3/envs/transformersum/lib/python3.8/site-packages/amp_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda12device_countEv

Instructions for fine-tuning

I am trying to apply your work to another language. Is it possible to use the existing models shown in the README and fine-tune them with my own data?

Any suggestion is appreciated.
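For reference, the training entry point takes any Hugging Face encoder name, so a multilingual checkpoint can be substituted for the English ones. A hypothetical sketch (the model name is an assumption, and the flags mirror the ones used elsewhere in these issues):

```shell
# Hypothetical: start from a multilingual encoder instead of an English one,
# then fine-tune on your own converted extractive dataset.
python src/main.py \
    --model_name_or_path bert-base-multilingual-cased \
    --data_path ./data/my_language/ \
    --weights_save_path ./trained_models \
    --do_train
```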

position_ids missing in state_dict when loading from checkpoint

When using `ExtractiveSummarizer.load_from_checkpoint` or `ExtractiveSummarizer.load_weights` to load most of the models, I find that the `position_ids` field is not saved in the checkpoint file, which causes an error. The only model that can be loaded correctly is distilbert-base-uncased-ext-sum.

This is the error I get when running predictions_website.py and trying to use bert-base-uncased-ext-sum checkpoints.


Exception happened during processing of request from ('127.0.0.1', 34450)
Traceback (most recent call last):
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/socketserver.py", line 320, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/socketserver.py", line 351, in process_request
    self.finish_request(request, client_address)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/socketserver.py", line 364, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/socketserver.py", line 724, in __init__
    self.handle()
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/http/server.py", line 418, in handle
    self.handle_one_request()
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/http/server.py", line 406, in handle_one_request
    method()
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/site-packages/gradio/networking.py", line 158, in do_POST
    prediction, durations = interface.process(raw_input)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/site-packages/gradio/interface.py", line 220, in process
    prediction = predict_fn(*processed_input)
  File "/home/myuser/intrical/repos/Intrical-Transformers/predictions_website.py", line 11, in summarize_text
    summarizer = ExtractiveSummarizer.load_from_checkpoint(model_choice)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, *args, strict=strict, **kwargs)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 192, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/home/myuser/anaconda3/envs/transformersum/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
	Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids". 
----------------------------------------

How can I correctly load any type of model?
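Since the missing `word_embedding_model.embeddings.position_ids` key is a non-trainable buffer (added in newer `transformers` releases, so checkpoints saved with older versions lack it), a commonly reported workaround is passing `strict=False`, e.g. `ExtractiveSummarizer.load_from_checkpoint(model_choice, strict=False)`; whether that keyword is forwarded depends on your pytorch-lightning version, so treat it as a suggestion to try. A plain-Python sketch of what strict vs. non-strict state_dict loading does:

```python
def load_state_dict(model_state, checkpoint_state, strict=True):
    """Plain-Python sketch of PyTorch's strict vs. non-strict loading semantics."""
    # Keys the model expects but the checkpoint lacks, and vice versa.
    missing = [k for k in model_state if k not in checkpoint_state]
    unexpected = [k for k in checkpoint_state if k not in model_state]
    if strict and (missing or unexpected):
        raise RuntimeError(f"Missing key(s) in state_dict: {missing}")
    # Non-strict: copy what matches, report the rest.
    for key, value in checkpoint_state.items():
        if key in model_state:
            model_state[key] = value
    return missing, unexpected

# A checkpoint saved without the (non-trainable) position_ids buffer:
model_state = {"embeddings.position_ids": [0, 1], "encoder.weight": [1.0]}
checkpoint = {"encoder.weight": [2.0]}

missing, unexpected = load_state_dict(model_state, checkpoint, strict=False)
print(missing)  # → ['embeddings.position_ids']
```

With `strict=True` the same call raises the `Missing key(s) in state_dict` error quoted above; with `strict=False` the buffer simply keeps its freshly initialized value, which is harmless because it is not learned.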

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found

Hi,

Here's the command I'm trying to run :
python main.py --mode abstractive --model_name_or_path bert-base-uncased --decoder_model_name_or_path bert-base-uncased --cache_file_path data --max_epochs 4 --do_train --do_test --batch_size 4 --weights_save_path model_weights --no_wandb_logger_log_model --accumulate_grad_batches 5 --use_scheduler linear --warmup_steps 8000 --gradient_clip_val 1.0 --custom_checkpoint_every_n 300 --gpus 0

My env :

name: transformersum
channels:
    - conda-forge
    - pytorch
dependencies:
    - pytorch
    - scikit-learn
    - tensorboard
    - spacy
    - spacy-model-en_core_web_sm
    - sphinx
    - pyarrow=4
    - pip
    - pip:
      - pytorch_lightning
      - transformers
      - torch_optimizer
      - click==7.0
      - wandb
      - rouge-score
      - packaging
      - datasets
      - gdown
      - gradio
      - torch==1.8.1+cu111
      - torchvision==0.9.1+cu111
      - torchaudio==0.8.1
      - -f https://download.pytorch.org/whl/torch_stable.html
variables:
    TOKENIZERS_PARALLELISM: true

Here's what I get :

Traceback (most recent call last):
  File "main.py", line 6, in <module>
    import datasets as nlp
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/datasets/__init__.py", line 23, in <module>
    import pyarrow
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pyarrow/__init__.py", line 63, in <module>
    import pyarrow.lib as _lib
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /opt/conda/envs/transformersum/lib/python3.6/site-packages/pyarrow/../../../libarrow.so.400)

I'm currently using Python 3.6.13.

Any help is welcome ! :)
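A common cause of this class of `GLIBCXX` error (an assumption here, since it depends on your system) is that the system-wide libstdc++ is older than what pyarrow was built against; installing a recent libstdc++ inside the conda env, where it is found first, often resolves it:

```shell
# Install a recent libstdc++ into the active conda environment
conda install -c conda-forge libstdcxx-ng
```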

TypeError: __init__() got an unexpected keyword argument 'gradient_checkpointing'

I am unable to load a checkpoint into my abstractive model and I don't know why. I am using the transformersum env with conda:

channels:
    - conda-forge
    - pytorch
dependencies:
    - pytorch
    - scikit-learn
    - tensorboard
    - spacy
    - sphinx
    - pyarrow
    - pre-commit
    - pip
    - pip:
      - pytorch_lightning
      - transformers==3.0.2
      - torch_optimizer
      - wandb
      - rouge-score
      - packaging
      - datasets
      - gradio

I run it with the command `python3 predict_abst.py`; here's what my code looks like:

```python
import glob
import os
import sys

sys.path.insert(0, os.path.abspath("./src"))
from abstractive import AbstractiveSummarizer

def summarize_text(text, model_choice):
    summarizer = AbstractiveSummarizer.load_from_checkpoint(model_choice)
    return summarizer.predict(text)

input = """Johannes Gutenberg (1398 – 1468) was a German goldsmith and publisher who introduced printing to Europe. His introduction of mechanical movable type printing to Europe started the Printing Revolution and is widely regarded as the most important event of the modern period. It played a key role in the scientific revolution and laid the basis for the modern knowledge-based economy and the spread of learning to the masses.

Gutenberg's many contributions to printing are: the invention of a process for mass-producing movable type, the use of oil-based ink for printing books, adjustable molds, and the use of a wooden printing press. His truly epochal invention was the combination of these elements into a practical system that allowed the mass production of printed books and was economically viable for printers and readers alike.

In Renaissance Europe, the arrival of mechanical movable type printing introduced the era of mass communication which permanently altered the structure of society. The relatively unrestricted circulation of information—including revolutionary ideas—transcended borders, and captured the masses in the Reformation. The sharp increase in literacy broke the monopoly of the literate elite on education and learning and bolstered the emerging middle class."""

print(summarize_text(input, "./models/epoch=18-step=2678.ckpt"))
```

and I am getting this error:

```python
Traceback (most recent call last):
  File "predict_abst.py", line 19, in <module>
    print(summarize_text(input,"./models/epoch=18-step=2678.ckpt"))
  File "predict_abst.py", line 9, in summarize_text
    summarizer = AbstractiveSummarizer.load_from_checkpoint(model_choice)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/pytorch_lightning/core/saving.py", line 195, in _load_model_state
    model = cls(**_cls_kwargs)
  File "/root/project/transformersum/src/abstractive.py", line 113, in __init__
    gradient_checkpointing=self.hparams.gradient_checkpointing,
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/transformers/modeling_auto.py", line 1213, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/opt/conda/envs/transformersum/lib/python3.6/site-packages/transformers/modeling_utils.py", line 672, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
TypeError: __init__() got an unexpected keyword argument 'gradient_checkpointing'
```

Extractive summarization very short

This is a very interesting library @HHousen, but my extractive summaries are always two sentences regardless of input document length. How can I increase the length? I would assume that the extractive decoder would, by design, return the top n sentences over a certain threshold?
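For context, the extractive models score every sentence and the summary keeps the top-n scores; recent versions of `ExtractiveSummarizer.predict` expose a `num_summary_sentences` argument to raise that n above the default (worth checking in your installed version). The selection step itself can be sketched without any model, using made-up scores:

```python
def select_top_sentences(sentences, scores, num_summary_sentences=3):
    # Rank sentence indices by score (descending), keep the top n,
    # then restore original document order for readability.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:num_summary_sentences])
    return " ".join(sentences[i] for i in chosen)

sentences = ["A first point.", "Filler text.", "A second point.", "More filler.", "A conclusion."]
scores = [0.9, 0.1, 0.8, 0.2, 0.7]  # hypothetical model outputs
print(select_top_sentences(sentences, scores, num_summary_sentences=3))
# → "A first point. A second point. A conclusion."
```

With `num_summary_sentences=2` only the two highest-scoring sentences survive, which matches the behavior described above when the default is small relative to the document length.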

Training fails for batch with 1 sentence only

If a batch contains source text with only 1 sentence, training fails (specifically, when loading examples after data processing). The issue seems to be that the output size should have been (2, 1), but instead it's (2, 2). The labels size is fine.

Steps to reproduce:

  • test_one_sent/train.0.json file:
[
    {"src": [["Hello", "."]], "labels": [0]},
    {"src": [["Hi", "."]], "labels": [0]}
]
  • test_one_sent/val.0.json file:
[
    {"src": [["Hey", "."]], "labels": [0]},
    {"src": [["Hiya", "."]], "labels": [0]}
]

Command to run:

python src/main.py --data_path ./data/test_one_sent/ --weights_save_path ./trained_models --do_train --max_steps 10

Error output:

2020-10-28 16:43:12,731|extractive|INFO> TRAIN_STEP 0
2020-10-28 16:43:12,731|extractive|INFO> batch
{'sent_lengths': [[4, 1], [5, 0]], 'sent_lengths_mask': tensor([[ True, False],
        [ True, False]]), 'input_ids': tensor([[    0, 13368,   479,     2,     0],
        [    0, 30086,  2636,   479,     2]]), 'attention_mask': tensor([[1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1]]), 'token_type_ids': tensor([[0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1]]), 'labels': tensor([[0],
        [0]]), 'sent_rep_token_ids': tensor([[0],
        [0]]), 'sent_rep_mask': tensor([[True],
        [True]])}
2020-10-28 16:43:12,758|extractive|ERROR> Target size (torch.Size([2, 1])) must be the same as input size (torch.Size([2, 2]))
2020-10-28 16:43:12,759|extractive|ERROR> Details about above error:
1. outputs=tensor([[-0.1405, -0.1505],
        [-0.1405, -0.1505]])
labels.float()=tensor([[0.],
        [0.]])
Traceback (most recent call last):
  File "src/main.py", line 403, in <module>
    main(main_args)
  File "src/main.py", line 97, in main
    trainer.fit(model)
.......
  File "/Users/salmanmohammed/dev/TransformerSum/src/extractive.py", line 326, in compute_loss
    loss = loss * mask.float()
UnboundLocalError: local variable 'loss' referenced before assignment

ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.

How can I run the model without a network connection? When I run it offline, it shows the following error:

Traceback (most recent call last):
File "/root/anaconda3/envs/transformersum/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py", line 328, in get_tokenizer_config
use_auth_token=use_auth_token,
File "/root/anaconda3/envs/transformersum/lib/python3.6/site-packages/transformers/file_utils.py", line 1412, in cached_path
local_files_only=local_files_only,
File "/root/anaconda3/envs/transformersum/lib/python3.6/site-packages/transformers/file_utils.py", line 1628, in get_from_cache
"Connection error, and we cannot find the requested files in the cached path."
ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
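If the model and tokenizer files were cached during a previous online run, the Hugging Face libraries can be told to use only the local cache via their offline environment variables (whether your cache is complete is an assumption; the first run still has to happen online):

```shell
# Use only locally cached models/tokenizers and datasets
export TRANSFORMERS_OFFLINE=1
export HF_DATASETS_OFFLINE=1
```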

Dataloader can't find data after preparing data for extractive summarization

I have noticed this happen consistently: if the data is not in .pt format and needs to be prepared, running the training script (main.py) prepares the data and then fails because the dataloader can't find the files. At that point the .pt files do exist, and simply rerunning the same command succeeds. It is a minor issue, but just FYI.

TypeError: forward() got an unexpected keyword argument 'source'

Hi there!

I was trying to fine-tune the DistilBERT model following the command from https://wandb.ai/hhousen/transformerextsum/runs/296s2066/overview
and removing all unavailable flags.
I used my custom JSON dataset following the guideline at https://transformersum.readthedocs.io/en/latest/extractive/convert-to-extractive.html
and the txt files were properly generated by your code.
However, I run into this problem when I try to fine-tune the model.
Any idea why this could be happening? Thank you in advance!

Training command:
python main.py --model_name_or_path distilbert-base-uncased --no_use_token_type_ids --pooling_mode sent_rep_tokens --data_path ../datasets/new_data/ --max_epochs 3 --accumulate_grad_batches 2 --warmup_steps 1800 --gradient_clip_val 1.0 --optimizer_type adamw --use_scheduler linear --do_train --do_test --data_type txt --dataloader_type map

Full logger:
Sanity Checking DataLoader 0: 0%| | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
File "/home/ubuntu/TransformerSum/src/main.py", line 509, in <module>
main(main_args)
File "/home/ubuntu/TransformerSum/src/main.py", line 137, in main
trainer.fit(model)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 770, in fit
self._call_and_handle_interrupt(
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 723, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 811, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1236, in _run
results = self._run_stage()
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1323, in _run_stage
return self._run_train()
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1345, in _run_train
self._run_sanity_check()
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1413, in _run_sanity_check
val_loop.run()
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
self.advance(*args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 154, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/base.py", line 204, in run
self.advance(*args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 128, in advance
output = self._evaluation_step(**kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 226, in _evaluation_step
output = self.trainer._call_strategy_hook("validation_step", *kwargs.values())
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 1765, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/pytorch_lightning/strategies/strategy.py", line 344, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/home/ubuntu/TransformerSum/src/extractive.py", line 778, in validation_step
outputs, mask = self.forward(**batch)
File "/home/ubuntu/TransformerSum/src/extractive.py", line 275, in forward
outputs = self.word_embedding_model(**inputs, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'source'

RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer?

I'm getting this error when I give it the model checkpoint.

Code:
model = ExtractiveSummarizer.load_from_checkpoint(path)

error:
2 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1405     if len(error_msgs) > 0:
   1406         raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
-> 1407             self.__class__.__name__, "\n\t".join(error_msgs)))
   1408     return _IncompatibleKeys(missing_keys, unexpected_keys)
   1409

RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".

Missing key(s) in state_dict

Good morning,

First of all thank you for the very interesting project.

I was trying to getting started with the repository using roberta model to perform extractive summarization of simple text, the problem is related to the loading of the model, I report the trace of the error hereafter:

Traceback (most recent call last):
  File "/home/moreno/transformersum/transformersum/src/example_extractive.py", line 4, in <module>
    model = ExtractiveSummarizer.load_from_checkpoint("../models/roberta-base-ext-sum")
  File "/home/moreno/anaconda3/envs/transformersum/lib/python3.9/site-packages/pytorch_lightning/core/saving.py", line 158, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "/home/moreno/anaconda3/envs/transformersum/lib/python3.9/site-packages/pytorch_lightning/core/saving.py", line 204, in _load_model_state
    model.load_state_dict(checkpoint['state_dict'], strict=strict)
  File "/home/moreno/anaconda3/envs/transformersum/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
        Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids". 

Do you have an idea why it is happening?

Unable to reproduce example

Hi!
I am trying to use the library for some experiments on extractive summarization.

I installed it as suggested in the guide and downloaded the roberta-base-ext-sum model.
However, when I try to

from extractive import ExtractiveSummarizer
model = ExtractiveSummarizer.load_from_checkpoint("path/epoch=3.ckpt")

I get the following exception:

File "C:\Users\silvia\Desktop\transformersum\prova.py", line 4, in <module>
model = ExtractiveSummarizer.load_from_checkpoint("C:/Users/silvia/Desktop/transformersum/models/epoch=3.ckpt")
File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 153, in load_from_checkpoint
model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 201, in _load_model_state
keys = model.load_state_dict(checkpoint["state_dict"], strict=strict)
File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".

How should I proceed?

Fine-tuning/Inference commands for "roberta-base-ext-sum"

Hi Hayden - thanks for making this repo (it's very helpful). I'm trying to re-create the best Extractive models on my own machines so I can modify. Can you help me locate the roberta-base-ext-sum model training commands for CNN/DM on the wandb page? Thanks!!

Abstractive BART Model , RuntimeError: The size of tensor a (64000) must match the size of tensor b (64001) at non-singleton dimension 1

Hi, I got an error when fine-tuning an abstractive BART model:

 | Name      | Type                          | Params
------------------------------------------------------------
0 | model     | MBartForConditionalGeneration | 420 M
1 | loss_func | LabelSmoothingLoss            | 0
------------------------------------------------------------
420 M     Trainable params
0         Non-trainable params
420 M     Total params
1,681.445 Total estimated model params size (MB)
Validation sanity check:   0%|                                                                                                          | 0/2 [00:00<?, ?it/s]Traceback (most recent call last):
 File "main.py", line 490, in <module>
   main(main_args)
 File "main.py", line 125, in main
   trainer.fit(model)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
   self._run(model)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 922, in _run
   self._dispatch()
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 990, in _dispatch
   self.accelerator.start_training(self)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
   self.training_type_plugin.start_training(trainer)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
   self._results = trainer.run_stage()
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1000, in run_stage
   return self._run_train()
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1035, in _run_train
   self._run_sanity_check(self.lightning_module)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1122, in _run_sanity_check
   self._evaluation_loop.run()
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 111, in run
   self.advance(*args, **kwargs)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 111, in advance
   dataloader_iter, self.current_dataloader_idx, dl_max_batches, self.num_dataloaders
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/loops/base.py", line 111, in run
   self.advance(*args, **kwargs)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 111, in advance
   output = self.evaluation_step(batch, batch_idx, dataloader_idx)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 158, in evaluation_step
   output = self.trainer.accelerator.validation_step(step_kwargs)
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 211, in validation_step
   return self.training_type_plugin.validation_step(*step_kwargs.values())
 File "/data/env/train_env/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 178, in validation_step
   return self.model.validation_step(*args, **kwargs)
 File "/data/summary_to_title/transformersum/src/abstractive.py", line 709, in validation_step
   cross_entropy_loss = self._step(batch)
 File "/data/summary_to_title/transformersum/src/abstractive.py", line 694, in _step
   outputs = self.forward(source, target, source_mask, target_mask, labels=labels)
 File "/data/summary_to_title/transformersum/src/abstractive.py", line 256, in forward
   loss = self.calculate_loss(prediction_scores, labels)
 File "/data/summary_to_title/transformersum/src/abstractive.py", line 674, in calculate_loss
   prediction_scores.view(-1, self.model.config.vocab_size), labels.view(-1)
 File "/data/env/train_env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
   return forward_call(*input, **kwargs)
 File "/data/summary_to_title/transformersum/src/helpers.py", line 282, in forward
   return F.kl_div(output, model_prob, reduction="batchmean")
 File "/data/env/train_env/lib/python3.7/site-packages/torch/nn/functional.py", line 2753, in kl_div
   reduced = torch.kl_div(input, target, reduction_enum, log_target=log_target)
RuntimeError: The size of tensor a (64000) must match the size of tensor b (64001) at non-singleton dimension 1

These are the parameters I used to initialize training:

python main.py \
--mode abstractive \
--model_name_or_path vinai/bartpho-word \
--max_epochs 50 \
--model_max_length 100 \
--dataset /data/summary_to_title/transformersum/data/train/train.arrow /data/summary_to_title/transformersum/data/val/val.arrow /data/summary_to_title/transformersum/data/test/test.arrow \
--data_example_column content \
--data_summarized_column title \
--cache_file_path /data/summary_to_title/transformersum/data \
--do_train \
--do_test \
--batch_size 4 \
--val_batch_size 8 \
--weights_save_path model_weights \
--use_logger wandb \
--wandb_project bartpho_word_sum \
--no_wandb_logger_log_model \
--accumulate_grad_batches 5 \
--learning_rate 3e-4 \
--use_scheduler linear \
--warmup_steps 8000 \
--gradient_clip_val 1.0 \
--split_char ^
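One possible cause of this off-by-one (an assumption here, not verified for `vinai/bartpho-word`) is that the tokenizer reports one more token than `model.config.vocab_size`, e.g. after a special token was added, so the label-smoothing distribution and the logits disagree at dimension 1. The usual remedy in `transformers` is `model.resize_token_embeddings(len(tokenizer))`; the pattern, demonstrated with stand-in mock objects rather than a real model:

```python
class MockConfig:
    def __init__(self):
        self.vocab_size = 64000  # what the model checkpoint was saved with


class MockModel:
    """Stand-in for a seq2seq model; only mimics the vocab-resize interface."""

    def __init__(self):
        self.config = MockConfig()

    def resize_token_embeddings(self, new_size):
        # The real method grows/shrinks the embedding matrix; here we only
        # record the new size on the config, as transformers does.
        self.config.vocab_size = new_size


tokenizer_len = 64001  # len(tokenizer): one extra (e.g. added special) token
model = MockModel()
if model.config.vocab_size != tokenizer_len:
    model.resize_token_embeddings(tokenizer_len)
print(model.config.vocab_size)  # → 64001
```

After the resize, the logits reshaped with `model.config.vocab_size` and the loss distribution built from the tokenizer length have the same width, which would avoid the 64000-vs-64001 mismatch, assuming that mismatch is indeed the cause here.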

AttributeError: '_LazyAutoMapping' object has no attribute '_mapping'

For

from transformers.models.auto.modeling_auto import MODEL_MAPPING

I get the following error:

AttributeError: '_LazyAutoMapping' object has no attribute '_mapping'

The `_LazyAutoMapping` class does indeed not have the attribute `_mapping` (see https://github.com/huggingface/transformers/blob/master/src/transformers/models/auto/auto_factory.py). However, it has a method `__iter__(self)` that is supposed to return `iter(self._mapping.keys())`, even though `_mapping` is never set in `__init__()`. Am I missing something here, or is this an error that needs to be fixed?

This is the error message:

 File "/content/drive/MyDrive/SharedColabNotebooks/Code/transformersum/src/main.py", line 8, in <module>
    from extractive import ExtractiveSummarizer
 File "/content/drive/.shortcut-targets-by-id/1AslFCJkKFwmDS9rtbO_CAdHbBWuL1I4A/SharedColabNotebooks/Code/transformersum/src/extractive.py", line 47, in <module>
    [m.model_type for m in MODEL_MAPPING]
 File "/usr/local/envs/transformersum_test/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 528, in __iter__
    return iter(self._mapping.keys())
AttributeError: '_LazyAutoMapping' object has no attribute '_mapping'

Any help is greatly appreciated!
