long-tailed's Introduction

On Long-Tailed Phenomena in NMT. Findings of EMNLP 2020.

Warning: The crux of the code, the Focal loss and Anti-Focal loss implementations, is available in the fairseq/criterions directory and can be used directly with fairseq. However, the end-to-end code is currently more of a code dump than a code release. Please wait for a (much better) cleaned-up version.

We use fairseq to train the models. Our code is tested on Ubuntu 18.04, with a Conda installation of Python 3.6.

git clone https://github.com/vyraun/long-tailed.git
cd long-tailed
pip install .

Other Repositories Used (thanks!):

Steps to Replicate

Below are the steps to replicate each section of the paper.

Section 1: Train the Cross-Entropy Baseline Transformer

The scripts with the prefix 'run' provide the full pipeline, from data preparation to evaluation. For example:

bash run_iwslt14_de_en.sh

Compute the Spearman's Rank Correlation between Norms and Frequencies:

python norm.py
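
For readers unfamiliar with the statistic, here is a minimal, dependency-free sketch of Spearman's rank correlation. It is not the code in norm.py. It assumes one norm value and one frequency count per vocabulary token, and the helper names and toy numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical toy values: one vector norm and one corpus count per token.
    norms = [0.8, 1.1, 1.9, 2.4]
    frequencies = [12, 40, 350, 9000]
    print(spearman(norms, frequencies))
```

Because the statistic depends only on ranks, it is unaffected by the heavy skew of token frequencies.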

Section 2: Characterizing the Long Tail

cd analysis
bash evauate_splits.sh [model_dir]
bash evauate_model_on_splits.sh [model_dir]

The plot can be generated using compare-mt.

Section 3: Analyze Beam Search

bash evaluate.sh model_dir data_dir
python probs_new.py beam_search.pkl
python probs_all.py [beam_search_*.pkl]

Section 4: Train Transformer using Focal and Anti-Focal Losses

The loss functions are implemented in the fairseq/criterions directory.

bash run_iwslt14_de_fc.sh
bash run_iwslt14_de_afc.sh
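
As a rough illustration of how the two losses reweight cross-entropy, here is a per-token sketch in plain Python. It is not the fairseq criterion code. The focal form is the standard -(1 - p)^gamma log p. The anti-focal form, -(1 + alpha * p)^gamma log p, is our reading of the paper and should be treated as an assumption to check against fairseq/criterions. The function names and default hyperparameters are hypothetical.

```python
import math


def cross_entropy(p):
    """Negative log-likelihood of the reference token with probability p."""
    return -math.log(p)


def focal_loss(p, gamma=1.0):
    """Focal loss: down-weights tokens the model already predicts confidently."""
    return -((1.0 - p) ** gamma) * math.log(p)


def anti_focal_loss(p, alpha=1.0, gamma=1.0):
    """Anti-focal loss (assumed form): up-weights confident tokens instead."""
    return -((1.0 + alpha * p) ** gamma) * math.log(p)


if __name__ == "__main__":
    # Compare the three losses for a low-confidence and a high-confidence token.
    for p in (0.1, 0.9):
        print(p, cross_entropy(p), focal_loss(p), anti_focal_loss(p))
```

With gamma = 0, both variants reduce to plain cross-entropy, which makes a convenient sanity check when wiring them into a training loop.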

Section 5: Tau Normalization Baseline

cd analysis
bash normalization.sh
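
For context, tau-normalization is usually described as rescaling each class weight vector w_i to w_i / ||w_i||^tau. The sketch below shows that rescaling in plain Python. It does not reproduce normalization.sh. It assumes the weights are the rows of an output projection matrix (one row per target token), and the function name, the `eps` guard, and the toy values are hypothetical.

```python
import math


def tau_normalize(weights, tau=1.0, eps=1e-12):
    """Divide each row by its L2 norm raised to the power tau.

    tau = 0 leaves the weights unchanged; tau = 1 gives every row unit norm.
    """
    normalized = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        scale = (norm + eps) ** tau  # eps avoids division by zero for all-zero rows
        normalized.append([w / scale for w in row])
    return normalized


if __name__ == "__main__":
    # Hypothetical toy matrix: a large-norm row and a small-norm row.
    toy_weights = [[3.0, 4.0], [0.3, 0.4]]
    print(tau_normalize(toy_weights, tau=0.5))
```

Intermediate values of tau shrink the gap between large-norm and small-norm rows without removing it entirely.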

Citation

@inproceedings{raunak2020longtailed,
  title = {On Long-Tailed Phenomena in Neural Machine Translation},
  author = {Raunak, Vikas and Dalmia, Siddharth and Gupta, Vivek and Metze, Florian},
  booktitle = {Findings of EMNLP},
  year = 2020,
}
