sanxing-chen / nmt2017-zh-en

Pre-processing and training scripts for WMT 2017 ZH-EN translation task

Shell 31.59% Python 68.41%

nmt2017-zh-en's Issues

How to reproduce the BLEU score on 2 GPU cards?

My env:

2 NVIDIA GeForce RTX 2080 Ti
PyTorch 1.5.0

Data source : http://www.statmt.org/wmt17/translation-task.html

I included "News Commentary v12" and "UN Parallel Corpus V1.0".

Data preprocessing follows prepare.sh.

Train :

CUDA_VISIBLE_DEVICES=0,1 fairseq-train data-bin/wmt17_en_zh \
    -a transformer --optimizer adam -s en -t zh \
    --label-smoothing 0.1 --dropout 0.3 --max-tokens 4000 \
    --min-lr '1e-09' --lr-scheduler inverse_sqrt --weight-decay 0.0001 \
    --criterion label_smoothed_cross_entropy --max-update 1000000 \
    --warmup-updates 10000 --warmup-init-lr '1e-7' --lr '0.001' \
    --adam-betas '(0.9, 0.98)' --adam-eps '1e-09' --clip-norm 25.0 \
    --keep-last-epochs 10 --save-dir checkpoints_test |& tee -a wmt17_train.test.log
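For readers checking the learning-rate flags in the command above, here is a minimal sketch of a linear-warmup, inverse-square-root-decay schedule using the values from that command (`--lr 0.001`, `--warmup-updates 10000`, `--warmup-init-lr 1e-7`). It follows my understanding of how fairseq's `inverse_sqrt` scheduler behaves; the function name `inverse_sqrt_lr` is made up for illustration, and the `--min-lr` stopping threshold is not modelled.

```python
import math


def inverse_sqrt_lr(step, peak_lr=1e-3, warmup_updates=10000, warmup_init_lr=1e-7):
    """Learning rate at a given update step (hypothetical helper, not fairseq code)."""
    if step < warmup_updates:
        # Linear warmup from warmup_init_lr up to peak_lr.
        return warmup_init_lr + step * (peak_lr - warmup_init_lr) / warmup_updates
    # After warmup, decay proportionally to 1 / sqrt(step).
    return peak_lr * math.sqrt(warmup_updates / step)


if __name__ == "__main__":
    for step in (0, 1000, 10000, 40000, 1000000):
        print(f"update {step:>8}: lr = {inverse_sqrt_lr(step):.2e}")
```

Under these assumptions the rate peaks at 0.001 at update 10000 and halves each time the update count quadruples, which may help when comparing against the `lr` column in a training log.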

Then I got a very bad score:

2020-07-28 11:22:01 | INFO | fairseq_cli.generate | Generate test with beam=5: BLEU4 = 0.00, 6.4/0.0/0.0/0.0 (BP=0.444, ratio=0.552, syslen=26013, reflen=47155)
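The `BP` and `ratio` fields in that line can be checked by hand. The sketch below applies the standard BLEU brevity-penalty formula to the `syslen` and `reflen` values from the log; the function name `brevity_penalty` is only illustrative and this is not fairseq's own scorer.

```python
import math


def brevity_penalty(syslen, reflen):
    """Standard BLEU brevity penalty: 1 if the output is at least as long
    as the reference, otherwise exp(1 - reflen / syslen)."""
    if syslen >= reflen:
        return 1.0
    return math.exp(1.0 - reflen / syslen)


if __name__ == "__main__":
    syslen, reflen = 26013, 47155  # values copied from the log line above
    print(f"ratio = {syslen / reflen:.3f}")                  # ratio = 0.552
    print(f"BP    = {brevity_penalty(syslen, reflen):.3f}")  # BP    = 0.444
```

So the system output is only about 55% of the reference length, and on top of that no 2-gram or longer matches were found (`6.4/0.0/0.0/0.0`), which is why the final score is 0.00.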

The training log is here:

https://drive.google.com/file/d/11l5c8VFH1nmZxjbVhD15U3PbWHBFkCtd/view?usp=sharing

Can you give me any suggestions about this result?
Thank you!

Reproducibility issue when training on a smaller dataset and fewer GPUs

Hi:

I would like to know how to replicate the result mentioned in the README: the model reaches 20 BLEU on the test set after training for only 2 epochs.

I simply used your setup to train my model; however, after 3 epochs, I got:

020-06-03 17:49:03 | INFO | fairseq_cli.generate | Generate test with beam = 5: BLEU4 = 0.09, 7.5/0.7/0.0/0.0 (BP=1.000, ratio=1.996, syslen=289332, reflen=144951)
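To compare several runs, it can help to pull the numbers out of such a line programmatically. The following is a rough sketch: the regular expression is derived only from the log lines quoted in these issues, so other fairseq versions may print a different format, and `parse_bleu_line` is a hypothetical helper name.

```python
import re

# Pattern based on the summary line quoted above (an assumption about the format).
BLEU_LINE = re.compile(
    r"BLEU4 = (?P<bleu>[\d.]+), (?P<precisions>[\d./]+) "
    r"\(BP=(?P<bp>[\d.]+), ratio=(?P<ratio>[\d.]+), "
    r"syslen=(?P<syslen>\d+), reflen=(?P<reflen>\d+)\)"
)


def parse_bleu_line(line):
    """Return the BLEU summary fields as a dict, or None if the line does not match."""
    match = BLEU_LINE.search(line)
    if match is None:
        return None
    return {
        "bleu": float(match["bleu"]),
        "precisions": [float(p) for p in match["precisions"].split("/")],
        "bp": float(match["bp"]),
        "ratio": float(match["ratio"]),
        "syslen": int(match["syslen"]),
        "reflen": int(match["reflen"]),
    }


if __name__ == "__main__":
    line = ("Generate test with beam = 5: BLEU4 = 0.09, 7.5/0.7/0.0/0.0 "
            "(BP=1.000, ratio=1.996, syslen=289332, reflen=144951)")
    fields = parse_bleu_line(line)
    print(fields)
    # The reported ratio is syslen / reflen.
    print(round(fields["syslen"] / fields["reflen"], 3))  # 1.996
```

Here the hypotheses are roughly twice as long as the references, the opposite of the too-short output in the 2-GPU issue above.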

My generation script is:

fairseq-generate data-bin/wmt17_zh_en \
    --path checkpoints/checkpoint_best.pt \
    --batch-size 128 --beam 5 --remove-bpe

The training data I used are:

training-parallel-nc-v12
United Nations Parallel-enzh

Thank you!

Could you upload the dataset files needed in the first step?

The first step requires us to download "all dataset required in the WMT 2017 MT task".
However, the FTP server used for the cwmt-wmt data is broken.
Does anyone have these files and is willing to share them?
Thanks a lot.

Lower performance in alignment compared to another preprocessing script.

Hi Sanxing, thank you for sharing this script!

I ran your preprocess.py (which cleans empty lines; I did not run the whole prepare.sh) and then used fast_align to learn an alignment model on the parallel corpus.
I found that the perplexity of the alignments given by the alignment model is higher than on the same parallel corpus preprocessed by another script, wmt.py.
I guess this is because that script merges blank lines.
So could you possibly add a merge-blank-lines function to your script in the future? Thanks a lot!
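As a point of comparison, here is a minimal sketch of preparing fast_align input while dropping any sentence pair in which either side is blank. It is not the logic of preprocess.py, nor the merging behaviour of wmt.py (neither is shown here); the function name `to_fast_align_lines` is hypothetical, and the only format assumption is fast_align's documented `source ||| target` line layout.

```python
def to_fast_align_lines(src_lines, tgt_lines):
    """Yield 'source ||| target' lines, skipping pairs where either side is blank.

    Both inputs must have the same number of lines so that pairs stay aligned.
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:  # drop the whole pair if one side is empty
            yield f"{src} ||| {tgt}"


if __name__ == "__main__":
    zh = ["你 好 世界", "", "谢谢"]
    en = ["hello world", "orphan line", ""]
    for line in to_fast_align_lines(zh, en):
        print(line)
```

Filtering pair by pair, rather than cleaning each side independently, keeps the two files in step; if blank lines are removed from only one side, every following pair is misaligned, which could plausibly raise alignment perplexity.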
