
bi-gru-translator's Introduction

Author: Liwei Song

Project name: Bi-GRU Machine translator (French to English) with attention.

Environment: Python 3.6 with PyTorch installed, Ubuntu 16.04 (or a compatible version on the cloud)


The translation program can be launched with the following shell command:

python tranlator.py

Process:

Data preparation --> Build RNN encoder/decoder --> Add attention to the decoder --> Try a bidirectional RNN --> Train the model --> Translation code --> Evaluation with BLEU

Dataset

The dataset consists of bilingual subtitles, originally from opensubtitles.org; opus.nlpl.eu converted the subtitles into a parallel corpus.
Original data format: tmx
Data size: 202180 subtitle pairs in the original dataset. 131690 pairs are kept (those whose words appear more than 3 times in the corpus).
Training sample size: 100000 pairs; validation sample size: 22000 pairs; test sample size: 9690 pairs.
Cleaned data: saved in the ./data folder in txt format (en.txt and fr.txt).
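
The frequency filter and the three-way split described above could be sketched as follows. This is a minimal illustration, not the code in Text_preprocessing_cloud.py: the function names, the whitespace tokenization, the per-language counting, and the sequential (unshuffled) split are all assumptions.

```python
from collections import Counter


def filter_rare_pairs(pairs, min_count=4):
    """Keep (fr, en) pairs in which every token occurs at least min_count times.

    min_count=4 mirrors "more than 3 times"; counting is done per language
    side, which is an assumption about the original preprocessing.
    """
    fr_counts = Counter(tok for fr, _ in pairs for tok in fr.split())
    en_counts = Counter(tok for _, en in pairs for tok in en.split())
    return [
        (fr, en)
        for fr, en in pairs
        if all(fr_counts[t] >= min_count for t in fr.split())
        and all(en_counts[t] >= min_count for t in en.split())
    ]


def split_pairs(pairs, n_train, n_valid):
    """Split sequentially into train / validation / test; test takes the remainder."""
    train = pairs[:n_train]
    valid = pairs[n_train:n_train + n_valid]
    test = pairs[n_train + n_valid:]
    return train, valid, test
```

With the 131690 kept pairs, split_pairs(pairs, 100000, 22000) leaves 9690 pairs for the test set, which matches the sizes listed above.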

Model Training


Language model: As I am focusing on sequence-to-sequence language models, two gated recurrent unit (GRU) networks are used, one as the encoder and one as the decoder. Due to time limitations, only a bidirectional RNN with 2 layers was tested (50000 epochs scheduled, 15000 finished).
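
The decoder also uses attention over the encoder states. A dependency-free sketch of one attention step, using the dot-product score from Luong et al., is given here for illustration. Which score function train_model actually uses, and how the forward and backward encoder states are combined, are not stated in this README, so the names and shapes are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(decoder_state, encoder_states):
    """Return (weights, context) for a single decoder step.

    decoder_state: list of H floats (current decoder hidden state)
    encoder_states: list of T lists of H floats (one per source position)
    """
    # Score each source position by its dot product with the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    hidden_size = len(decoder_state)
    # Context vector: attention-weighted average of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(hidden_size)
    ]
    return weights, context
```

The context vector is then typically combined with the decoder state before predicting the next word.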

Optimization methods: Minibatch gradient descent is used to approximate the optimal solution, and backpropagation is used to compute the gradient of the objective function on each minibatch.
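
To make the update rule concrete, here is a toy sketch of minibatch gradient descent on a one-parameter model. It is not the project's training loop: the model (y = w * x), the learning rate, and the function names are hypothetical, and the gradient is written by hand where the real model would obtain it through backpropagation.

```python
import random


def minibatches(data, batch_size, rng):
    """Yield shuffled minibatches that together cover data once."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]


def train_linear(data, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y = w * x by minibatch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        for batch in minibatches(data, batch_size, rng):
            # Gradient of mean((w*x - y)^2) with respect to w, on this batch only.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w
```

Each step uses only the current minibatch to estimate the gradient, so the cost per update does not depend on the size of the full training set.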

The model is trained on the NYU HPC cloud with the parameters set in run2.sh.
However, it can also be trained locally on a CPU (recommended for devices without CUDA).

Command:    ./run2.sh for cloud   
            python train_cloud2.py

File descriptions:

tranlator.py: main program
train_model: defines the encoder, decoder, and attention classes as well as the evaluation functions
batch.py: generates minibatches during training
Text_preprocessing_cloud.py: text preprocessing; defines the language class for convenient word embedding
masked_cross_entropy: defines the cross-entropy error used as the objective function
./data/model2-update-decoder.pth: saved decoder
./data/model2-update-encoder.pth: saved encoder
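
The roles of batch.py and masked_cross_entropy can be illustrated together: sentences in a minibatch have different lengths, so they are padded to a common length, and the loss then has to ignore the padded positions. The sketch below uses plain Python lists instead of tensors; the PAD index and both function signatures are hypothetical and are not taken from the project files.

```python
import math

PAD = 0  # hypothetical padding index


def pad_batch(sequences, pad=PAD):
    """Pad token-id sequences to the longest length; return (padded, lengths)."""
    lengths = [len(s) for s in sequences]
    max_len = max(lengths)
    padded = [s + [pad] * (max_len - len(s)) for s in sequences]
    return padded, lengths


def masked_cross_entropy(log_probs, targets, lengths):
    """Mean negative log-likelihood over real (non-padded) target positions.

    log_probs[b][t][v]: log-probability of token v at step t of sequence b.
    """
    total, count = 0.0, 0
    for b, length in enumerate(lengths):
        for t in range(length):  # steps at or beyond `length` are padding and are skipped
            total -= log_probs[b][t][targets[b][t]]
            count += 1
    return total / count
```

Without the mask, short sentences would contribute many padding positions to the loss and distort the gradient.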

To do list:

Calculate the BLEU score for the validation dataset.
Compare other RNN unit combinations: GRU/LSTM, different parameters. Finish the 50000 epochs, or stop training when the training error converges.
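
For the BLEU item, a minimal corpus-level computation could look like the sketch below. It assumes one reference translation per sentence, pre-tokenized input, n-grams up to length 4, and no smoothing; the function names are hypothetical, and an established implementation may be preferable for reported numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis (lists of tokens)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any n-gram order has no match
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses that are shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

The hypotheses would be the model's translations of the validation sentences, and the references the corresponding English subtitles.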

Reference list:

Effective Approaches to Attention-based Neural Machine Translation
https://arxiv.org/abs/1508.04025
Used for the attention model.
https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
Seq2seq tutorial, used as a reference and as the basis for my translator.
