pt.seq2seq's Introduction

Seq2seq

PyTorch implementations of seq2seq models for the neural machine translation (NMT) task:

  • seq2seq (RNN)
  • seq2seq with attention (RNN + attention)
  • ConvS2S
  • Transformer
  • DynamicConv (+ LightConv)

No-torchtext version

Please refer to the no-torchtext tag. In that version, dataset.py, lang.py and data_prepare.py structure the low-level text so that it is easier to use in the training code.
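As a rough illustration of what such a text-structuring layer does, the sketch below maps tokens to integer ids and back. The class name `Lang` only echoes the lang.py file name; the methods, special tokens and id layout are assumptions made for this example, not the repository's actual code.

```python
class Lang:
    """Hypothetical vocabulary: maps tokens to integer ids and back."""

    # Assumed special tokens; the repository may use different ones.
    SPECIALS = ["<pad>", "<sos>", "<eos>", "<unk>"]

    def __init__(self, name):
        self.name = name
        self.idx2word = list(self.SPECIALS)
        self.word2idx = {w: i for i, w in enumerate(self.idx2word)}

    def add_sentence(self, sentence):
        """Register every whitespace-separated token of a sentence."""
        for word in sentence.split():
            if word not in self.word2idx:
                self.word2idx[word] = len(self.idx2word)
                self.idx2word.append(word)

    def encode(self, sentence):
        """Convert a sentence to ids, wrapped in <sos> ... <eos>."""
        unk = self.word2idx["<unk>"]
        ids = [self.word2idx.get(w, unk) for w in sentence.split()]
        return [self.word2idx["<sos>"]] + ids + [self.word2idx["<eos>"]]

    def decode(self, ids):
        """Convert ids back to tokens, dropping special tokens."""
        words = [self.idx2word[i] for i in ids]
        return " ".join(w for w in words if w not in self.SPECIALS)


eng = Lang("eng")
eng.add_sentence("i am cold")
print(eng.encode("i am hungry"))  # the unseen word "hungry" maps to the <unk> id
```

Keeping the id mapping in one small class means the training code only ever sees lists of integers, which can then be padded and batched into tensors.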

Supported datasets

Supported datasets include the PyTorch tutorial ENG-to-FRA translation dataset and the torchtext NMT datasets.

Dependencies

  • python3
  • pyyaml
  • pytorch >= 1.10
  • tensorboard >= 1.14
  • torchtext
  • spacy
    • python -m spacy download en
    • python -m spacy download de

Results

Hparams:

  • Task & data: ENG to FRA translation task, max_len=14, min_freq=2.
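The two preprocessing hyperparameters can be read as a length filter and a frequency cutoff. The sketch below shows one plausible interpretation; the function names, the whitespace tokenization, and the choice to treat `max_len` as an inclusive token count without special tokens are all assumptions, not the repository's code.

```python
from collections import Counter


def filter_pairs(pairs, max_len=14):
    """Keep sentence pairs whose source and target both have at most max_len tokens."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if len(src.split()) <= max_len and len(tgt.split()) <= max_len
    ]


def build_vocab(sentences, min_freq=2):
    """Return the sorted tokens that occur at least min_freq times."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return sorted(tok for tok, n in counts.items() if n >= min_freq)


pairs = [
    ("i am cold", "j ai froid"),
    ("i am here", "je suis ici"),
    ("a " * 20, "un " * 20),  # 20 tokens per side, longer than max_len
]
kept = filter_pairs(pairs, max_len=14)
vocab = build_vocab([src for src, _ in kept], min_freq=2)
print(len(kept), vocab)
```

Tokens that fall below `min_freq` would typically be replaced by an unknown-word token at encoding time.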

Models:

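| Model                         | Loss (sum) | PPL   | BLEU* | Note            |
| ----------------------------- | ---------- | ----- | ----- | --------------- |
| Seq2Seq                       | 15.11      | 6.320 |       |                 |
| Seq2Seq + KV attn             | 13.57      | 5.244 | 64.10 |                 |
| Seq2Seq + Additive attn       | 13.28      | 5.054 | 64.48 |                 |
| Seq2Seq + Multiplicative attn | 14.01      | 5.526 |       |                 |
| ConvS2S                       | 13.06      | 4.931 | 61.62 |                 |
| ConvS2S + out-caching         | 12.44      | 4.572 | 60.90 |                 |
| Transformer-init              | 12.73      | 4.675 | 66.38 |                 |
| LightConv                     | 12.29      | 4.493 |       | K=[3,3,5,5,7,7] |
| DynamicConv                   | 11.81      | 4.237 | 68.35 | K=[3,3,5,5,7,7] |
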
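  • [!] BLEU was recorded in a different run.
  • PPL and BLEU rankings do not match: for example, ConvS2S has a lower PPL than Seq2Seq + Additive attn but also a lower BLEU.
  • About the Transformer:
    • After-norm does not work; before-norm should be used.
    • LR warmup and Xavier initialization are important for performance.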
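
For readers unfamiliar with LR warmup, one widely used schedule is the inverse-square-root schedule from the original Transformer paper. Whether this repository uses this exact schedule is an assumption; the function name and the default values of `d_model` and `warmup_steps` below are illustrative only.

```python
def warmup_lr(step, d_model=512, warmup_steps=4000):
    """Inverse-square-root schedule with linear warmup.

    The rate rises linearly for warmup_steps steps, then decays as step ** -0.5.
    d_model and warmup_steps are illustrative defaults, not the repository's settings.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first call
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


for step in (1, 1000, 4000, 16000):
    print(step, f"{warmup_lr(step):.6f}")
```

In PyTorch, a function like this can be passed as the `lr_lambda` of `torch.optim.lr_scheduler.LambdaLR`, with the optimizer's base learning rate set to 1.0 so that the function's return value becomes the effective rate.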

ToDo

  • Beam search
  • Word tokenization
    • BPE
    • Word piece model
