SEScore

Description

In this repo we explore different methods to improve the SEScore evaluation metric.

Background

SEScore is a reference-based text-generation evaluation metric that requires no human-annotated error data. It is described in the paper Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis.
Broadly, the paper describes a stratified dataset synthesis pipeline with three stages. First, sentences are corrupted via predefined methods. Next, each newly corrupted sentence receives a score that represents how "severe" the corruption was, determined via bi-directional entailment. Finally, a neural network is trained to predict the scores produced by the bi-directional entailment model.
While this method performs very well and has improved upon the SOTA, we believe there is room for improvement.
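The pipeline described above can be sketched roughly as follows. This is a toy illustration and not the code from the paper or this repo: the two corruption operations, the token-overlap function standing in for a real entailment model, the 0.9 threshold, and the -1/-5 penalties are all assumptions made for the example.

```python
import random
from typing import Callable, List, Tuple

Tokens = List[str]


def delete_token(tokens: Tokens, rng: random.Random) -> Tokens:
    """Drop one randomly chosen token (hypothetical corruption op)."""
    if len(tokens) <= 1:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def swap_adjacent(tokens: Tokens, rng: random.Random) -> Tokens:
    """Swap one randomly chosen pair of neighbours (hypothetical corruption op)."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens) - 1)
    out = list(tokens)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def overlap_entailment(premise: Tokens, hypothesis: Tokens) -> float:
    """Toy stand-in for an entailment model: share of hypothesis tokens found in the premise."""
    if not hypothesis:
        return 0.0
    premise_set = set(premise)
    return sum(tok in premise_set for tok in hypothesis) / len(hypothesis)


def step_severity(
    before: Tokens,
    after: Tokens,
    entail: Callable[[Tokens, Tokens], float] = overlap_entailment,
    threshold: float = 0.9,  # assumed value
    minor: float = -1.0,     # assumed penalty for a minor error
    major: float = -5.0,     # assumed penalty for a major error
) -> float:
    """Label one corruption step using entailment in both directions."""
    both_ways = min(entail(before, after), entail(after, before))
    return minor if both_ways >= threshold else major


def synthesize(reference: str, num_steps: int, seed: int = 0) -> List[Tuple[str, float]]:
    """Corrupt a sentence step by step and accumulate the discrete penalties."""
    rng = random.Random(seed)
    ops = [delete_token, swap_adjacent]
    current = reference.split()
    total = 0.0
    history = []
    for _ in range(num_steps):
        corrupted = rng.choice(ops)(current, rng)
        total += step_severity(current, corrupted)
        history.append((" ".join(corrupted), total))
        current = corrupted
    return history


if __name__ == "__main__":
    for sentence, score in synthesize("the cat sat on the mat", num_steps=3):
        print(f"{score:6.1f}  {sentence}")
```

The resulting (sentence, score) pairs are the kind of synthetic training data a regression model would then be fit to. Note that in this sketch the running total can only decrease with each step, and each step contributes one of only two values.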

Suggested improvements

  1. The paper describes a stratified way to accumulate errors by corrupting sentences. The corruption is performed by adding, replacing, deleting, or swapping tokens in the original sentence. While effective, recent papers have shown more effective masking techniques that could help create more meaningful corruptions. Our first proposal is to use PMI masking instead of token masking when corrupting the sentences.
  2. The severity score used in the paper follows the MQM framework for assessing the severity of errors in text. While this framework has been used in many papers, the cumulative nature of the suggested severity score makes it monotonic, so it may not accurately represent the changes happening in the newly corrupted sentence. Additionally, the score is discrete, which could lead to a loss of information when attributing severity to an error.
    We propose two changes to the severity score that allow it to be non-monotonic and continuous. For more details, please refer to the Research Proposal.pptx file.
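To make the first proposal more concrete, here is a minimal sketch of the idea behind PMI-based span selection. It is not the implementation used in this repo: it handles only bigrams, estimates pointwise mutual information from raw counts in a tiny made-up corpus, and the function names and threshold are hypothetical. The point is that strongly associated neighbours are grouped into a single span, so a corruption step would treat them as one unit rather than as independent tokens.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def bigram_pmi(corpus: List[str]) -> Dict[Tuple[str, str], float]:
    """Estimate PMI(a, b) = log(p(a, b) / (p(a) * p(b))) for adjacent token pairs."""
    unigrams: Counter = Counter()
    bigrams: Counter = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
    return pmi


def pmi_spans(
    tokens: List[str],
    pmi: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Greedily group neighbours whose PMI reaches the threshold into one span.

    Returns half-open (start, end) index pairs that cover the whole sentence.
    """
    spans = []
    i = 0
    while i < len(tokens):
        pair_score = float("-inf")
        if i + 1 < len(tokens):
            pair_score = pmi.get((tokens[i], tokens[i + 1]), float("-inf"))
        if pair_score >= threshold:
            spans.append((i, i + 2))  # corrupt both tokens together
            i += 2
        else:
            spans.append((i, i + 1))  # fall back to a single token
            i += 1
    return spans


if __name__ == "__main__":
    # Tiny made-up corpus, for illustration only.
    corpus = [
        "new york is big",
        "i love new york",
        "the city is big",
        "i love the city",
    ]
    scores = bigram_pmi(corpus)
    print(pmi_spans("i love new york".split(), scores, threshold=2.0))
```

A corruption step could then sample one of these spans and apply an add, replace, delete, or swap operation to the whole span at once.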

Results

NLP poster with graph

How to run?

Run new_xlm_mbart_data.py for English:

python3 new_xlm_mbart_data.py -num_var 10 -lang en_XX -src case_study_src -ref case_study_ref -save save_file_name -severity ['original','2_1','2_2'] -whole_words True

Contributors

transformmaster, ofekglick, yonip97, xu1998hz
