
seq2seq.pytorch's Introduction

Seq2Seq in PyTorch

This is a complete suite for training sequence-to-sequence models in PyTorch. It provides several model implementations along with code for both training and inference.

Using this code you can train:

  • Neural-machine-translation (NMT) models
  • Language models
  • Image-to-caption generation
  • Skip-thought sentence representations
  • And more...

Installation

git clone --recursive https://github.com/eladhoffer/seq2seq.pytorch
cd seq2seq.pytorch; python setup.py develop

Models

Models currently available:

  • RecurrentAttentionSeq2Seq - recurrent encoder-decoder with an attentional decoder
  • Transformer - attention-only model from "Attention Is All You Need"
  • GNMT - Google-neural-machine-translation-style recurrent model
  • ByteNet and convolutional models (experimental; see Issues below)

Datasets

Datasets currently available:

  • WMT16 (de-en)
  • WMT17 (de-en)
  • COCO image captions

All datasets can be tokenized using one of three available segmentation methods:

  • Character-based segmentation
  • Word-based segmentation
  • Byte-pair encoding (BPE), as suggested by Sennrich et al. (subword-nmt), with a selectable number of tokens

After choosing a tokenization method, a vocabulary will be generated and saved for future inference.
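
For illustration, here is a minimal sketch of the BPE workflow using the subword-nmt package that this segmentation method is based on; the file names are hypothetical placeholders, not paths used by this repo:

# Minimal BPE sketch (pip install subword-nmt); file names are hypothetical.
from subword_nmt.learn_bpe import learn_bpe
from subword_nmt.apply_bpe import BPE

# Learn a merge table with 32k symbols from a training corpus
with open('train.txt') as infile, open('bpe.codes', 'w') as outfile:
    learn_bpe(infile, outfile, num_symbols=32000)

# Apply the learned merges; the resulting token set defines the vocabulary
with open('bpe.codes') as codes:
    bpe = BPE(codes)
print(bpe.process_line('a sample sentence to segment'))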

Training methods

The models can be trained using several methods:

  • Basic Seq2Seq - given an encoded source sequence, generate (decode) the output sequence. Training is done with teacher forcing (see the sketch after this list).
  • Multi Seq2Seq - several tasks (such as multiple languages) are trained simultaneously, using the data sequences as both encoder inputs and decoder targets.
  • Image2Seq - used to train image-to-caption generators.
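
Teacher forcing means the decoder is fed the ground-truth previous token at each step rather than its own prediction. A minimal, generic sketch of one such training step in PyTorch; the model interface here is hypothetical, not this repo's API:

import torch
import torch.nn.functional as F

def train_step(model, optimizer, src, tgt):
    """One teacher-forced step. `model(src, tgt_in)` is a hypothetical
    encoder-decoder returning logits of shape (batch, len, vocab)."""
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]  # shift targets by one position
    logits = model(src, tgt_in)                # decoder sees the gold history
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tgt_out.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()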

Usage

Example training scripts are available in the scripts folder. Inference examples are available in the examples folder.

  • Example: training a Transformer on WMT16, following the original paper's training regime:
DATASET=${1:-"WMT16_de_en"}
DATASET_DIR=${2:-"./data/wmt16_de_en"}
OUTPUT_DIR=${3:-"./results"}

WARMUP="4000"
LR0="512**(-0.5)"

python main.py \
  --save transformer \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model Transformer \
  --model-config "{'num_layers': 6, 'hidden_size': 512, 'num_heads': 8, 'inner_linear': 2048}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 100 \
  --device-ids 0 \
  --label-smoothing 0.1 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'step_lambda':
                          \"lambda t: { \
                              'optimizer': 'Adam', \
                              'lr': ${LR0} * min(t ** -0.5, t * ${WARMUP} ** -1.5), \
                              'betas': (0.9, 0.98), 'eps':1e-9}\"
                          }]"
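
The step_lambda above implements the inverse-square-root ("Noam") schedule from the Transformer paper, with base rate LR0 = hidden_size**-0.5 and linear warmup. A small sketch of the same computation, using the values from the script above:

def noam_lr(step, d_model=512, warmup=4000):
    # Linear warmup for `warmup` steps, then inverse-square-root decay
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

print(noam_lr(1), noam_lr(4000), noam_lr(100000))  # peak at step == warmup
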
  • Example: training an attentional LSTM-based model with 3 layers in both encoder and decoder:
python main.py \
  --save de_en_wmt17 \
  --dataset ${DATASET} \
  --dataset-dir ${DATASET_DIR} \
  --results-dir ${OUTPUT_DIR} \
  --model RecurrentAttentionSeq2Seq \
  --model-config "{'hidden_size': 512, 'dropout': 0.2, \
                   'tie_embedding': True, 'transfer_hidden': False, \
                   'encoder': {'num_layers': 3, 'bidirectional': True, 'num_bidirectional': 1, 'context_transform': 512}, \
                   'decoder': {'num_layers': 3, 'concat_attention': True,\
                               'attention': {'mode': 'dot_prod', 'dropout': 0, 'output_transform': True, 'output_nonlinearity': 'relu'}}}" \
  --data-config "{'moses_pretok': True, 'tokenization':'bpe', 'num_symbols':32000, 'shared_vocab':True}" \
  --b 128 \
  --max-length 80 \
  --device-ids 0 \
  --trainer Seq2SeqTrainer \
  --optimization-config "[{'epoch': 0, 'optimizer': 'Adam', 'lr': 1e-3},
                          {'epoch': 6, 'lr': 5e-4},
                          {'epoch': 8, 'lr':1e-4},
                          {'epoch': 10, 'lr': 5e-5},
                          {'epoch': 12, 'lr': 1e-5}]"
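
The 'dot_prod' attention mode configured above is standard dot-product attention over the encoder outputs. A generic sketch of the computation (not this repo's exact module):

import torch

def dot_product_attention(query, keys, values, mask=None):
    """query: (batch, d); keys/values: (batch, src_len, d).
    Generic dot-product attention, as in Luong et al."""
    scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    if mask is not None:
        scores = scores.masked_fill(mask, float('-inf'))     # hide padding
    weights = torch.softmax(scores, dim=1)
    context = torch.bmm(weights.unsqueeze(1), values).squeeze(1)  # (batch, d)
    return context, weights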

seq2seq.pytorch's People

Contributors

eladhoffer, nelson-liu


seq2seq.pytorch's Issues

How do we use the scripts under the train folder?

Hi. I would like to know how to train these models. When I ran train_en_de.sh, I got errors:

Traceback (most recent call last):
  File "/media/vivien/A/NEW-SMT/seq2seq.pytorch-master/main.py", line 15, in <module>
    from seq2seq.tools.utils.log import setup_logging
ModuleNotFoundError: No module named 'seq2seq.tools.utils.log'

(The same traceback is repeated four times.)

ModuleNotFoundError: No module named 'apply_bpe'

Hello, the problem I found is that even after I installed the subword-nmt library, running the code below (in Generate_Caption.ipynb) still raises an error.
checkpoint = load_url('https://dl.dropboxusercontent.com/s/05dvriaiqk74cum/caption_resnet50-4c0fa803.pth', map_location={'cuda:0':'cpu'})

The error is:

  File "/home/noone/anaconda3/envs/tensorflow_3_180/lib/python3.6/site-packages/torch/serialization.py", line 469, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'apply_bpe'

My environment is python 3.6 and I'm using conda.

I don't know why, and I hope you can help with that.
Thanks in advance.
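
A possible (untested) workaround: the checkpoint appears to pickle objects from a top-level apply_bpe module, so aliasing subword-nmt's copy of that module under the same name before loading may let unpickling resolve it:

import sys
from subword_nmt import apply_bpe
# The checkpoint pickled references to a top-level 'apply_bpe' module;
# register subword-nmt's module under that name so unpickling can find it.
sys.modules['apply_bpe'] = apply_bpe

from torch.utils.model_zoo import load_url
checkpoint = load_url('https://dl.dropboxusercontent.com/s/05dvriaiqk74cum/caption_resnet50-4c0fa803.pth',
                      map_location={'cuda:0': 'cpu'})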

ByteNet and Conv

Hi, are the ByteNet and convolutional models here working? Have they been tested? It seems they have been broken since some refactoring, but I'm wondering whether the code there was tried before and is worth fixing to fit the refactoring.

module 'torch.nn' has no attribute 'MultiheadAttention'

Hi
In the file modules/attention.py, the class MultiHeadAttention(nn.MultiheadAttention) raises an error:

class MultiHeadAttention(nn.MultiheadAttention):
AttributeError: module 'torch.nn' has no attribute 'MultiheadAttention'

I use pytorch 0.4.1. Is there a version of torch that contains nn.MultiheadAttention?
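
For reference, nn.MultiheadAttention first shipped in PyTorch 1.1.0 (to the best of my knowledge), so it is absent in 0.4.1. A quick check:

import torch
import torch.nn as nn

print(torch.__version__)
# nn.MultiheadAttention was introduced in torch 1.1.0; on 0.4.1 this is False
print(hasattr(nn, 'MultiheadAttention'))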

ImportError: cannot import name 'PermutedSequenceGenerator'

When running main.py, I get the following error:

  File "main.py", line 15, in <module>
    from seq2seq import models, datasets
  File "seq2seq.pytorch\seq2seq\models\__init__.py", line 1, in <module>
    from .transformer import Transformer, TransformerAttentionDecoder, TransformerAttentionEncoder
  File "seq2seq.pytorch\seq2seq\models\transformer.py", line 6, in <module>
    from .seq2seq_base import Seq2Seq
  File "seq2seq.pytorch\seq2seq\models\seq2seq_base.py", line 8, in <module>
    from seq2seq.tools.beam_search import SequenceGenerator, PermutedSequenceGenerator
ImportError: cannot import name 'PermutedSequenceGenerator'

I'm not sure whether it needs to be imported for training to work.
Thanks
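
If PermutedSequenceGenerator is not referenced anywhere else in that file, one possible (untested) local patch is to import only the class that still exists:

# seq2seq/models/seq2seq_base.py, line 8 - possible local patch, assuming
# PermutedSequenceGenerator was removed from tools/beam_search.py during
# refactoring and is unused in the rest of the file:
from seq2seq.tools.beam_search import SequenceGenerator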

ModuleNotFound Error

Hi Elad,

I am using your code for my task. I face an error: ModuleNotFoundError: No module named 'seq2seq.tools.utils.log'. I already ran python setup.py develop, but after that it still shows the same error.

Best
Arunav
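
A quick sanity check, assuming import seq2seq itself succeeds, to see which installation Python picks up and whether the submodule is present:

import importlib.util
import seq2seq

print(seq2seq.__file__)  # confirm this points at the cloned repo, not a stale copy
spec = importlib.util.find_spec('seq2seq.tools.utils.log')
print(spec)  # None means the module is missing from the imported package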
