summ-n's Issues

Result for SummScreen FD

Thanks for sharing the code. It was very helpful.

I tried replicating the results for the SummScreen FD dataset without making any changes to the code or the config.

I got the following results:
F-measure: [23.54, 4.42, 20.88]
Recall: [20.59, 3.84, 18.22]
Precision: [32.85, 6.24, 29.31]

The results in the paper closely match the ROUGE precision scores rather than the ROUGE F1 scores. Can you verify this, or suggest a fix if I am doing something wrong?
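
For reference, a minimal sketch (not from this repo) that recombines F1 from the averaged precision and recall listed above; ROUGE averages per-summary F1, so this is only a rough consistency check:

# Rough check only: recombine F1 from the corpus-averaged P/R reported above.
precision = [32.85, 6.24, 29.31]
recall = [20.59, 3.84, 18.22]
f1 = [round(2 * p * r / (p + r), 2) for p, r in zip(precision, recall)]
print(f1)  # -> [25.31, 4.75, 22.47] with these averages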

Thanks!

Errors encountered during training a summarizer.

I encounter the following problem at run.py line 77:
/workspace/shared/apps/anaconda3/envs/summarization/bin/python: Error while finding module specification for 'examples.roberta.multiprocessing_bpe_encoder' (ModuleNotFoundError: No module named 'examples')

HMNet Outputs for human evaluation

In the paper you present a human evaluation comparing your model against HMNet. Did you run HMNet yourself to get the outputs or did you find them somewhere?

Could you share both your system outputs as well as HMNet's on AMI and ICSI?

Bugs preventing reproduction of the results

The code is broken in a number of places, making it impossible to reproduce results:

  • There is no way to specify args.mode in run.py other than editing it into the script (see the sketch after this list).
    • Passing --mode train to run.py does not work, because that argument is stored in args.train.mode, not args.mode.
    • The same is true for args.checkpoint_dir for inference.
  • In gen_summary/inference.py:38, you assign self.bart.cfg.dataset.batch_size_valid = bsz. I'm not sure what this is supposed to do, but the bart object comes from fairseq and has no cfg member.
  • In stage 2, run.py does not actually use the output from stage 1 and will just create the same input/output as stage 1 again.
    • The others are one-line fixes, but I am not sure how to fix this one.
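
A hypothetical workaround for the nested-argument problem above (the config object and attribute names in run.py may differ; this is a sketch, not the repo's actual code):

def resolve_mode(args, default="train"):
    # Hypothetical helper: prefer args.mode, but fall back to the nested
    # args.train.mode where the --mode override reportedly ends up.
    mode = getattr(args, "mode", None)
    if mode is None and hasattr(args, "train"):
        mode = getattr(args.train, "mode", None)
    return mode or default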

For others who are also trying to reproduce:

  • The required fairseq version is actually 0.10.0; there is no 1.10.0.
    • You also need to copy the examples folder somewhere and add its parent folder to PYTHONPATH. You must move it out of the main fairseq directory, or the fairseq logging module will override the Python built-in one.
  • Since AnyROUGE does not provide any installation instructions: clone the library and add its root folder to PYTHONPATH.
  • Then clone https://github.com/pltrdy/rouge into the ThirdParty folder inside AnyROUGE. If you put it somewhere else or pip install it, you will need to edit the scripts to change the imports. (A path sketch follows below.)
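
A minimal Python sketch of the path arrangement described above, assuming the copied examples folder and the AnyROUGE clone live at the placeholder locations below (adjust them to your setup):

import os
import sys

# Placeholder locations; both are assumptions, not paths used by the repo.
FAIRSEQ_EXAMPLES_PARENT = "/opt/deps/fairseq-examples"  # parent of the copied examples/ folder
ANYROUGE_ROOT = "/opt/deps/AnyROUGE"                    # AnyROUGE clone, with pltrdy/rouge at ThirdParty/rouge

for path in (FAIRSEQ_EXAMPLES_PARENT, ANYROUGE_ROOT):
    if path not in sys.path:
        sys.path.insert(0, path)

# Also export for subprocesses, e.g. the examples.roberta.multiprocessing_bpe_encoder call in run.py.
os.environ["PYTHONPATH"] = os.pathsep.join(
    p for p in (FAIRSEQ_EXAMPLES_PARENT, ANYROUGE_ROOT, os.environ.get("PYTHONPATH", "")) if p
)

With the AnyROUGE root on the path, the from ThirdParty.rouge.rouge.rouge_score import * line in segmentor_core.py resolves to the pltrdy/rouge clone.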

OSError: Model file not found: ./output/AMI/stage_1/trainer_output/checkpoints/checkpoint_best.pt

Hi author, regarding the Summ-N framework, I have a problem in the very first stage of training the model. I configured the environment as required, but when I use the AMI script (i.e., go to the Summ-N-main directory and run the bash scripts/run_AMI.sh command), run.py always runs in test mode and returns an error; it turns out the default mode is not training mode. I then assigned "train" to the args.mode variable in run.py, which does enter training mode, but then I get the following problem: OSError: Model file not found: ./output/AMI/stage_1/trainer_output/checkpoints/checkpoint_best.pt. This means the model weights file is missing, but I am not sure why the .pt file is missing; when I open the checkpoints folder, it contains no files. I would be grateful if the author could answer this question sometime, thanks!
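
For anyone hitting the same error, a small diagnostic sketch (hypothetical, not part of the repo): the OSError means fairseq training never wrote checkpoint_best.pt, i.e. stage-1 training did not complete, so checking for the file before anything tries to load it gives a clearer failure.

import os

# Hypothetical early check: fail with a clearer message when stage-1 training
# has not produced a checkpoint yet (the cause of the OSError above).
ckpt = "./output/AMI/stage_1/trainer_output/checkpoints/checkpoint_best.pt"
if not os.path.isfile(ckpt):
    raise FileNotFoundError(
        f"{ckpt} is missing: stage-1 training has not completed, so there is "
        "no model to load for summary generation."
    )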

Problem in reproducing

Hi

I can't reproduce this repository, because there is an unknown import, ThirdParty.rouge.rouge.rouge_score, in the segmentor_core.py file:

import nltk
from rouge import Rouge
from ThirdParty.rouge.rouge.rouge_score import *
from utils.tools import download_nltk

I checked the AnyROUGE repository, but there is no such module.
Could you look into this issue?

Thank you so much in advance!

ThirdParty module not found

Import "ThirdParty.rouge.rouge.rouge_score" in the segmentor_core.py file, but it was not found in ThirdParty

name 'Ngrams' is not defined

I can't replicate the code. When I try to run !bash scripts/run_AMI.sh, it throws the following error: NameError: name 'Ngrams' is not defined (screenshot of the traceback attached).
Does anyone know why this happens? Thanks :)

The order of the prediction result

Hi, I am glad you released some of the prediction results; they really help me follow your work.
However, I have one question: what is the order of the prediction results? Can you tell me which test file corresponds to each output?

Some issues encountered while reproducing the code

Hello, I encountered the following issue while trying to reproduce your code:

[nltk_data] Downloading package punkt to /home/isaac/nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package punkt to /home/isaac/nltk_data...
[nltk_data] Package punkt is already up-to-date!
Finish loading stage 0 dataset!
Train size: 97
Val size: 20
Test size: 20
Start target matching of Stage 1. This may take several minutes.
637it [00:00, 1936066.41it/s]
122it [00:00, 1336044.62it/s]
139it [00:00, 1362168.82it/s]
Finish loading stage 1 dataset!
Train size: 637
Val size: 122
Test size: 139
2023-06-06 03:29:21 | INFO | fairseq.file_utils | Archive name 'stage_1/trainer_output' was not found in archive name list. We assumed 'stage_1/trainer_output' was a path or URL but couldn't find any file associated to this path or URL.
Traceback (most recent call last):
  File "run.py", line 91, in <module>
    summary_generator = SummaryGenerator(args, split_source, fine_grained=False, test_mode=True)
  File "/home/isaac/Summ-N/models/gen_summary/inference.py", line 18, in __init__
    self.stage_cfg.trainer_output_folder,
  File "/home/isaac/.conda/envs/summ-n/lib/python3.7/site-packages/fairseq/models/bart/model.py", line 122, in from_pretrained
    **kwargs,
  File "/home/isaac/.conda/envs/summ-n/lib/python3.7/site-packages/fairseq/hub_utils.py", line 55, in from_pretrained
    kwargs["data"] = os.path.abspath(os.path.join(model_path, data_name_or_path))
  File "/home/isaac/.conda/envs/summ-n/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
scripts/run_AMI.sh: 19: --cuda-devices: not found

The error seems to occur in the call to BARTModel.from_pretrained in inference.py, and the issue is TypeError: expected str, bytes or os.PathLike object, not NoneType:

self.bart = BARTModel.from_pretrained(
    self.stage_cfg.trainer_output_folder,
    checkpoint_file='checkpoints/checkpoint_best.pt',
    data_name_or_path="./bin",
)

Question 1: Do you have any idea what might have caused this? How can I resolve it? Should I modify the data_name_or_path variable?
Question 2: What does the error scripts/run_AMI.sh: 19: --cuda-devices: not found mean?
Question 3: Why does it log “2023-06-06 03:29:21 | INFO | fairseq.file_utils | Archive name 'stage_1/trainer_output' was not found in archive name list. We assumed 'stage_1/trainer_output' was a path or URL but couldn't find any file associated to this path or URL.”?

Thank you very much!
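
For anyone hitting the same trace: the INFO line and the traceback suggest that fairseq cannot resolve the relative path 'stage_1/trainer_output', so model_path ends up None inside hub_utils.from_pretrained. A hypothetical workaround sketch, assuming the stage-1 output lives under ./output/AMI (adjust the path to your config):

import os
from fairseq.models.bart import BARTModel

# Assumed output location for the AMI stage-1 run; adjust to your config.
trainer_output = os.path.abspath("./output/AMI/stage_1/trainer_output")
assert os.path.isdir(trainer_output), "stage-1 training output folder not found; train first"

# Passing an absolute, existing folder avoids os.path.join(None, ...) in hub_utils.
bart = BARTModel.from_pretrained(
    trainer_output,
    checkpoint_file="checkpoints/checkpoint_best.pt",
    data_name_or_path="./bin",
)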

Version of fairseq

Hello, I saw that you wrote that fairseq version 1.10.0 is required, but I could not find 1.10.0 among fairseq's released versions. Did you mean version 0.10.0? Thanks!
