neulab / xnmt

eXtensible Neural Machine Translation

License: Other

Python 99.51% Shell 0.19% C++ 0.30%

xnmt's Issues

Ability to set learning rate of trainer

It would be nice to have an option to set the learning rate of the trainer. If the option is not specified, use the default learning rate.

Also, I think we should make Adam the default trainer. SGD takes forever to train.

Make docstring formatting consistent

Currently the format of our docstrings is not consistent. I'd suggest following the format in ResidualLSTMEncoder, as it seems to be the most thoroughly documented. We should also add the documentation style to the README.md or some other coding style document.

Update: To be more clear -- This means "use double quotes for docstrings" and "mark parameters as @param"

Feature Request: Integration Tests

It would be nice if we had integration tests that at least made sure that things didn't break on Python 2 or 3.

Error during decoding

Hi,

I ran a super-small experiment training on the dev set from the example Japanese data (see xnmt-small.yaml in mcds-exp.zip) and got the following error. It looks like this might be because search_strategy is outputting NumPy arrays instead of integers at each timestep. Is anyone else getting this problem?

[dynet] random seed: 852191151
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
=> Running ja_check
   > Training   
   Start training in minibatch mode...   
   Epoch 1.0000: train_ppl=372.9873 (words=5057, time=0-00:00:01)   
   Epoch 1.0000: test_ppl=220.3990 (words=5057, time=0-00:00:02)   
   Epoch 1.0000: best dev loss, writing model to xnmtmodel/dev.mod   
   > Evaluating   
   Traceback (most recent call last):
     File "/home/gneubig/work/xnmt/xnmt/xnmt_run_experiments.py", line 123, in <module>
          xnmt_trainer.input_reader.vocab, xnmt_trainer.output_reader.vocab, xnmt_trainer.translator))
     File "/usr0/home/gneubig/work/xnmt/xnmt/xnmt_decode.py", line 52, in xnmt_decode
          target_sentence = output_generator.process(token_string)[0]
     File "/usr0/home/gneubig/work/xnmt/xnmt/output.py", line 32, in process
          self.token_string.append(self.vocab[token])
     File "/usr0/home/gneubig/work/xnmt/xnmt/vocab.py", line 35, in __getitem__
          return self.i2w[i]
   TypeError: only integer scalar arrays can be converted to a scalar index

Standard example fails (on Python3)

Currently the standard example seems to fail on the master branch with Python3:

(python3) gneubig@lor:~/work/xnmt$ python xnmt/xnmt_run_experiments.py examples/standard.yaml 
Traceback (most recent call last):
  File "xnmt/xnmt_run_experiments.py", line 18, in <module>
    import xnmt.xnmt_preproc, xnmt.xnmt_train, xnmt.xnmt_decode, xnmt.xnmt_evaluate
ModuleNotFoundError: No module named 'xnmt'

@msperber @philip30 , have you tested on Python3 and not encountered this error (i.e. it's a problem with my environment) or not tested at all?

Cython setup doesn't work on Mac OS?

It seems that Cython is broken on macOS, e.g.:

(python3) neubig@itachi:~/work/xnmt$ python setup.py build_ext --inplace --use-cython-extensions
running build_ext
building 'xnmt.cython.xnmt_cython' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/neubig/anaconda/envs/python3/include -arch x86_64 -I/Users/neubig/anaconda/envs/python3/include -arch x86_64 -I/Users/neubig/anaconda/envs/python3/include/python3.6m -c xnmt/cython/xnmt_cython.cpp -o build/temp.macosx-10.7-x86_64-3.6/xnmt/cython/xnmt_cython.o -std=c++11
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/neubig/anaconda/envs/python3/include -arch x86_64 -I/Users/neubig/anaconda/envs/python3/include -arch x86_64 -I/Users/neubig/anaconda/envs/python3/include/python3.6m -c xnmt/cython/src/functions.cpp -o build/temp.macosx-10.7-x86_64-3.6/xnmt/cython/src/functions.o -std=c++11
xnmt/cython/src/functions.cpp:3:10: fatal error: 'unordered_map' file not found
#include <unordered_map>
         ^~~~~~~~~~~~~~~
1 error generated.
error: command 'gcc' failed with exit status 1

six and scipy should be added to requirements.txt

As said in title

Create tested "recipes"

It would be nice if xnmt had a list of "recipes" that get competitive results on standard datasets. These would be in contrast to "examples", which do not necessarily have to have this accuracy guarantee, or do not have to be on standard datasets. Some examples would be:

Standard attentional model on WMT2014
Speech-to-text translation on Fisher corpus

Any other interesting things.

Corpus filtering option in xnmt_train, or as a pre-processing step

So as we've seen, having super-long sentences in our training set can cause xnmt to run out of memory. I think there are a couple of ways around this:

Create xnmt_preprocess.py, which will perform pre-processing of a corpus. One of the options could be to remove all sentences that are over a certain length.
Change the corpus reading code in xnmt_train.py to allow it to throw away sentences over a certain length when reading in the corpus.

What do you think?
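
Either option boils down to the same filter over parallel sentence pairs. A minimal sketch, assuming whitespace-tokenized text; the function name filter_by_length and the max_len parameter are hypothetical and not part of xnmt:

```python
def filter_by_length(src_sents, trg_sents, max_len):
    """Keep only sentence pairs where both sides have at most max_len tokens."""
    kept_src, kept_trg = [], []
    for src, trg in zip(src_sents, trg_sents):
        # Drop the whole pair if either side is too long, so the two
        # sides of the corpus stay aligned line by line.
        if len(src.split()) <= max_len and len(trg.split()) <= max_len:
            kept_src.append(src)
            kept_trg.append(trg)
    return kept_src, kept_trg


src = ["a b c", "a b c d e f", "x y"]
trg = ["1 2", "1 2 3", "1 2 3 4 5 6 7"]
filtered_src, filtered_trg = filter_by_length(src, trg, max_len=5)
print(filtered_src, filtered_trg)
```

The same function could be called from a separate pre-processing script or from the corpus reader, so the two options are not mutually exclusive.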

Multi-reference evaluation

It would be nice if we could do evaluation with multiple references, particularly for BLEU.
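
For BLEU, the usual multi-reference treatment clips each hypothesis n-gram count by the maximum count of that n-gram in any single reference. A minimal sketch of that clipping step only (no brevity penalty); the function names are hypothetical and this is not xnmt's evaluator:

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hyp, refs, n):
    """Return (clipped matches, total hypothesis n-grams) against several references."""
    hyp_counts = ngrams(hyp, n)
    # For each n-gram, remember the largest count seen in any one reference.
    max_ref = Counter()
    for ref in refs:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matches = sum(min(count, max_ref[gram]) for gram, count in hyp_counts.items())
    return matches, sum(hyp_counts.values())


hyp = "the the the cat".split()
refs = ["the cat sat".split(), "the the dog".split()]
print(clipped_matches(hyp, refs, 1))
print(clipped_matches(hyp, refs, 2))
```

The reference length used for the brevity penalty also has to be chosen when several references exist; that choice is left open here.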

Experiment Configuration File Format

The documentation Experiment Configuration File Format seems to be outdated.
For example: train_target vs train_trg.

Memory leak when saving model?

Hi guys, I tried using xnmt on Graham's advice, but I run into a problem when running:

python xnmt_run_experiments.py ../test/experiments-config.txt

At the end of each epoch, the memory consumed by the program either grows by ~500MB or increases continually until my computer freezes.

I use the cpu version of dynet and my OS is ubuntu 15.10

EDIT: After doing more tests, I have more information:

Not sure if this is model saving. It seems to happen exclusively between two epochs.
Maybe this is linked to dynet models, which are notably bugged? (i.e., if you delete a model, the memory allocated to the parameters is not freed, cf. clab/dynet#418)


Examples on Standard Datasets

It would be nice if we had examples of how to run xnmt on standard datasets and get great scores, the best scores! I think this could be implemented by restructuring the examples folder to have one large README.md explaining what each of the examples is, then sub-directories with a README.md explaining the various commands that need to be run to obtain the data and train the model. For models that can be run as-is from the top directory of xnmt using the example data, it's fine to leave them as-is, although a short explanation in the top README.md might be warranted.

Is Minimum Risk Training needed?

I wonder whether this would be a good feature.

I'd also like to open a discussion about a good design for this kind of training.

Decorators break Python 2

I get an error with the most recent code using Python 2:

Traceback (most recent call last):
  File "xnmt/xnmt_run_experiments.py", line 10, in <module>
    import xnmt_preproc, xnmt_train, xnmt_decode, xnmt_evaluate
  File "/Users/neubig/work/xnmt/xnmt/xnmt_train.py", line 13, in <module>
    from encoder import *
  File "/Users/neubig/work/xnmt/xnmt/encoder.py", line 8, in <module>
    from decorators import recursive
  File "/Users/neubig/work/xnmt/xnmt/decorators.py", line 29
    def rec_f(obj, *args, **kwargs, context=None):
                                      ^
 SyntaxError: invalid syntax

Any ideas @philip30 ?
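
Python 2 has no keyword-only parameters, and Python 3 also requires **kwargs to be the last parameter, so a portable workaround is to pop the optional keyword out of kwargs by hand. A minimal sketch; the decorator body is a hypothetical stand-in, not the actual code in decorators.py:

```python
import functools


def recursive(f):
    """Hypothetical stand-in for the decorator defined in decorators.py."""
    @functools.wraps(f)
    def rec_f(obj, *args, **kwargs):
        # Pop the optional keyword manually instead of declaring it
        # after **kwargs, which the parser rejects.
        context = kwargs.pop("context", None)
        return f(obj, context, *args, **kwargs)
    return rec_f


@recursive
def describe(obj, context, suffix=""):
    return "{}:{}{}".format(obj, context, suffix)


print(describe("enc", context="train", suffix="!"))
print(describe("enc"))
```

Callers still pass context=... by keyword, so call sites would not need to change under this assumption.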

Better Support for Sharing Components, Multi-task Learning

Currently it is very difficult to implement multi-task learning in xnmt. This could probably be fixed by doing a few things:

Making it possible to define something like CompoundModel, which can contain a Translator and a Retriever, two Translators, etc.
Making it easier to "reference" previously defined model components. For example, Translator number two might "reference" the Encoder of Translator number 1. This would allow them to share a single encoder and train them in a multi-task fashion.
Coming up with a new TrainingTask interface that references a model and its training data and parameters, a DecodingTask that performs decoding, and an EvaluationTask that performs evaluation for the various tasks.

This would be a large refactoring of the code, but could potentially make things much more flexible, so it would potentially be nice to have.

CustomCompactLSTMBuilder is super-slow

This change caused a 20-fold slowdown on GPU (confirmed by both me and @philip30):
e75b548#diff-d5c251b1ac3d0ca2da44bf734613b4c9L86

Could you take a look and fix/revert it?

standard example seems broken

Hi,

First of all, as a dynet/nmt fan, this project is very exciting!

To the issue: I tried running the standard example from the documentation using:

python xnmt/xnmt_run_experiments.py examples/standard.yaml

And got the following output:

[dynet] random seed: 2045434078
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Traceback (most recent call last):
  File "xnmt/xnmt_run_experiments.py", line 108, in <module>
    config = config_parser.args_from_config_file(args.experiments_file)
  File "/home/nlp/aharonr6/git/xnmt/xnmt/options.py", line 105, in args_from_config_file
    {name: self.check_and_convert(task_name, name, value) for name, value in exp_task_values.items()})
  File "/home/nlp/aharonr6/git/xnmt/xnmt/options.py", line 105, in <dictcomp>
    {name: self.check_and_convert(task_name, name, value) for name, value in exp_task_values.items()})
  File "/home/nlp/aharonr6/git/xnmt/xnmt/options.py", line 42, in check_and_convert
    raise RuntimeError("Unknown option {} for task {}".format(option_name, task_name))
RuntimeError: Unknown option encoder_layers for task train

Is the example broken or is it me doing something wrong?
Thanks!

One Sentence One Model

One of these would be nice to have implemented:

https://arxiv.org/pdf/1609.06490.pdf
http://aclweb.org/anthology/W/W17/W17-4713.pdf

Implement External Evaluator

It would be nice to be able to call an external evaluation program. cdec has a very nice interface for doing so:

https://github.com/redpony/cdec/tree/master/mteval

It would be cool if we had something like this.

Error while loading the pre-trained model

initialized BilingualTrainingCorpus({'dev_src': '/projects/tir2/users/sjpadman/temp_data/bilingual_dev_src.txt', 'dev_trg': '/projects/tir2/users/sjpadman/temp_data/bilingual_dev_tar.txt', 'train_src': '/projects/tir2/users/sjpadman/temp_data/bilingual_train_src.txt', 'train_trg': '/projects/tir2/users/sjpadman/temp_data/bilingual_train_tar.txt'})
   Traceback (most recent call last):
     File "xnmt/xnmt_run_experiments.py", line 166, in <module>
          sys.exit(main())
     File "xnmt/xnmt_run_experiments.py", line 120, in main
          xnmt_trainer = xnmt.xnmt_train.XnmtTrainer(train_args)
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/xnmt_train.py", line 101, in __init__
          self.load_corpus_and_model()
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/xnmt_train.py", line 162, in load_corpus_and_model
          self.corpus_parser = self.model_serializer.initialize_object(corpus_parser) if self.need_deserialization else self.args.corpus_parser
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/serializer.py", line 54, in initialize_object
          return self.init_components_bottom_up(deserialized_yaml, deserialized_yaml.dependent_init_params(), context=context)
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/serializer.py", line 139, in init_components_bottom_up
          init_params[init_arg] = self.init_components_bottom_up(val, sub_dependent_init_params, context)
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/serializer.py", line 139, in init_components_bottom_up
          init_params[init_arg] = self.init_components_bottom_up(val, sub_dependent_init_params, context)
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/serializer.py", line 158, in init_components_bottom_up
          print("initialized %s(%s)" % (obj.__class__.__name__, init_params))
     File "/projects/tir1/users/sjpadman/xnmt/xnmt/tee.py", line 40, in write
          self.stdstream.write(" " * self.indent + data)
   UnicodeEncodeError: 'ascii' codec can't encode character '\xe1' in position 99: ordinal not in range(128)

The above error is thrown while trying to load a pre-trained model.
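
The traceback ends in a write to a stream whose encoding is ASCII, so any non-ASCII character in the printed parameters triggers the failure. A minimal Python 3 sketch of one possible workaround, wrapping the underlying byte stream with an explicit UTF-8 encoder; the class Utf8Tee is a hypothetical stand-in and not the real tee.py:

```python
import io


class Utf8Tee(object):
    """Hypothetical stand-in for the stream wrapper in tee.py."""

    def __init__(self, binary_stream, indent=0):
        # Encode explicitly as UTF-8 instead of relying on the locale default.
        self.stdstream = io.TextIOWrapper(binary_stream, encoding="utf-8")
        self.indent = indent

    def write(self, data):
        self.stdstream.write(" " * self.indent + data)
        self.stdstream.flush()


# The same character cannot be encoded as ASCII, which is what the traceback reports.
try:
    "\xe1".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

buf = io.BytesIO()  # stands in for sys.stdout.buffer
tee = Utf8Tee(buf, indent=2)
tee.write("initialized \xe1")
print(ascii_failed, buf.getvalue())
```

Setting the PYTHONIOENCODING environment variable to utf-8 before launching Python is another way to change the encoding of the standard streams without touching the code.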

Configuration Files shouldn't be copied if not provided

I think the YAML configuration shouldn't be copied if the yaml_file config is not specified.
It seems odd to copy the configuration to the directory where the scripts are being run.

For example, if you run the test of XNMT, it will copy all the test/config/*.yaml to the root of xnmt.

Need directions about how to run experiments

Adding a simple example to the README would be nice.

readthedocs

It'd be nice if we could have the documentation in an easy-to-read format like readthedocs: http://readthedocs.org

Length normalization broken when evaluating on dev set

See #201 for details.

Feature Request: Tokenization

It would be nice to be able to perform tokenization/detokenization as part of the preprocessing capability: #104

Options include BPE, sentencepiece, or manual tokenization like the Moses tokenizer. For ease of implementation, particularly for sentencepiece, I think it's OK to assume a call to an external program when implementing these.

Two issues with dev set evaluation when doing minibatching

Hi @CharlotteKay , I have two questions about evaluation when using minibatching.

First, it looks like the number of words evaluated in the dev set is inconsistent when using minibatching or not. Here is without:

[dynet] random seed: 3841206789
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Start training in non-minibatch mode...
0.01 Dev perplexity: 616.9578868143398 (32490.217478 over 5057 words)
0.02 Dev perplexity: 468.18571355561767 (31094.810513 over 5057 words)
0.03 Dev perplexity: 428.57042830985233 (30647.721363 over 5057 words)
0.04 Dev perplexity: 495.58422153685274 (31382.413588 over 5057 words)

and here is with (32 sentences):

[dynet] random seed: 811370858
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Start training in minibatch mode...
0.33557046979865773 Dev perplexity: 347.5941161824009 (13638.763672 over 2331 words)
0.6711409395973155 Dev perplexity: 311.98835044678697 (13386.853394 over 2331 words)

Second, I think we should probably evaluate after the same number of sentences regardless of the minibatch size. Now we are evaluating every 100*minibatch_size sentences, but let's add an eval_every setting that is specified in sentences, and then evaluate every eval_every sentences.
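
The second point can be sketched as a counter over sentences rather than minibatches. This is only an illustration under assumed names: EvalScheduler and eval_points are hypothetical, and eval_every is the proposed setting, not an existing option:

```python
class EvalScheduler(object):
    """Trigger a dev evaluation every eval_every sentences, whatever the batch size."""

    def __init__(self, eval_every):
        self.eval_every = eval_every
        self.sents_since_eval = 0

    def update(self, batch_size):
        """Record one minibatch; return True if an evaluation is due."""
        self.sents_since_eval += batch_size
        if self.sents_since_eval >= self.eval_every:
            # Carry the overshoot forward so the schedule does not drift.
            self.sents_since_eval -= self.eval_every
            return True
        return False


def eval_points(num_sents, batch_size, eval_every):
    """Simulate one pass over num_sents sentences; list where evaluation fires."""
    scheduler = EvalScheduler(eval_every)
    points, seen = [], 0
    while seen < num_sents:
        step = min(batch_size, num_sents - seen)
        seen += step
        if scheduler.update(step):
            points.append(seen)
    return points


print(eval_points(10000, 1, 3200))
print(eval_points(10000, 32, 3200))
```

When the batch size does not divide eval_every, evaluation fires at the first batch boundary past each multiple, which is as close to the requested schedule as whole minibatches allow.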

Feature Request: "Attention is All You Need"

This is a very interesting paper which shouldn't be super-difficult to implement, and would also test the extensibility of xnmt: https://arxiv.org/abs/1706.03762

If someone is interested I think this would be a great example to have.

preproc.yaml should include SentencePiece detokenization

I believe this is missing detokenization when evaluating BLEU score: https://github.com/neulab/xnmt/blob/master/examples/preproc.yaml

Multi-dataset Evaluation

It would be nice if we could evaluate on multiple test sets. This could be done in one of two ways:

Pass in multiple files
Pass in a single file, but specify a range of lines that correspond to each different set

I prefer the first, but the second might be OK as well.

Create over-arching layer size option

Currently xnmt has a bunch of different places to specify the size of the embeddings, encoder, decoder, etc. I think it would be helpful to make it possible to specify a default layer size that is used in all places, unless something else is specified explicitly.
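
One way to read this request is a lookup that falls back to a shared default. A minimal sketch; the option names default_layer_size and <component>_layer_size are hypothetical, not existing xnmt options:

```python
def resolve_layer_sizes(config, components=("embedder", "encoder", "decoder")):
    """Use the per-component size when given, else the shared default."""
    default = config.get("default_layer_size")
    sizes = {}
    for name in components:
        size = config.get(name + "_layer_size", default)
        if size is None:
            raise ValueError(
                "No size given for {} and no default_layer_size set".format(name))
        sizes[name] = size
    return sizes


layer_sizes = resolve_layer_sizes({"default_layer_size": 512, "decoder_layer_size": 1024})
print(layer_sizes)
```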

Documentation of expected input/output of each function call for the upper-level abstract classes

I think we should document the type of the expected input/output of each function call of the upper-level abstract classes to make sure that our API is clear. Here are the classes that need documentation:

Print training speed in words/sec

Just for convenience, could we print the number of words processed per second every time we print logging information? This could be done for the training and dev sets.
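
A minimal sketch of the bookkeeping this would need; SpeedTracker is a hypothetical helper, and the clock is injectable only so the example is deterministic:

```python
import time


class SpeedTracker(object):
    """Accumulate processed words and report throughput since construction."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.start = clock()
        self.words = 0

    def add_words(self, num_words):
        self.words += num_words

    def report(self):
        elapsed = self.clock() - self.start
        # Guard against a zero interval on very fast first reports.
        rate = self.words / elapsed if elapsed > 0 else 0.0
        return "words={}, words/sec={:.2f}".format(self.words, rate)


# Fake clock: construction happens at t=0s, the report at t=8s.
ticks = iter([0.0, 8.0])
tracker = SpeedTracker(clock=lambda: next(ticks))
tracker.add_words(9426)
speed_line = tracker.report()
print(speed_line)
```

Keeping one tracker for training and another for the dev set would give separate rates for each.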

Cease Python 2 Support?

This is not an issue so much, but I'm thinking that we can stop supporting Python 2, like NumPy: https://github.com/numpy/numpy/blob/master/doc/neps/dropping-python2.7-proposal.rst

If there are no objections, I will drop the requirement to support Python 2 (maybe next week?).

Decoder is not initialized to final encoder state

It is important to initialize the decoder with the final state of the encoder, but we are not doing this well. We should fix this:
https://github.com/neulab/xnmt/blob/master/xnmt/translator.py#L91

Specify experiment to run by name

Given an experiment file, it might be nice to be able to specify which experiments to run via a command line option. If the command line option was not specified, we could revert to the current behavior of running all of them.
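
A minimal sketch of how such an option could behave, using optional positional arguments after the experiment file; the argument names and the select_experiments helper are hypothetical:

```python
import argparse


def select_experiments(all_names, argv):
    """Return the experiments to run: those named on the command line, or all."""
    parser = argparse.ArgumentParser()
    parser.add_argument("experiments_file")
    parser.add_argument("experiment_name", nargs="*",
                        help="experiments to run; omit to run all of them")
    args = parser.parse_args(argv)
    if not args.experiment_name:
        # No names given: keep the current behavior and run everything.
        return list(all_names)
    unknown = [name for name in args.experiment_name if name not in all_names]
    if unknown:
        raise ValueError("Unknown experiment(s): {}".format(", ".join(unknown)))
    return args.experiment_name


available = ["ja_check", "standard"]
print(select_experiments(available, ["experiments.yaml"]))
print(select_experiments(available, ["experiments.yaml", "standard"]))
```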

WordTrgSrcBatcher is not working in the current master branch.

After merging the master branch into my current working branch, I found a bug in the current implementation of WordTrgSrcBatcher: it is broken and does not work. Has it been tested yet?

masking not implemented correctly with initialization in lstm.py

In lstm.py, the customLSTMbuilder uses the class LSTMState. When you call add_input, you just pass in the previous state. However, if you call a customLSTMbuilder with initial_state, it returns an LSTMState with initialized c and h values, but this initial state does not have a previous_state property. So when you call add_input on it, the initialized c and h are not passed in. Is this initialization information lost permanently?

Report fine-grained statistics for BLEU

It would be nice if BLEU could also report fine-grained statistics similar to the following (from the mt-evaluator program of my travatar toolkit)

e.g.: BLEU = 0.56557, 0.82951/0.757936/0.71831/0.68288 (BP=0.758942, ratio=0.783803, hyp_len=5449, ref_len=6952)

This gives you the precision of each n-gram, the brevity penalty, and the overall length compared to the reference. This is really useful in debugging, as sometimes we're getting a low BLEU score just because our method is outputting hypotheses that are too short.
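
The numbers in the sample line are consistent with the standard BLEU definition: the score is the brevity penalty times the geometric mean of the four n-gram precisions, with BP = exp(1 - ref_len/hyp_len) when the hypothesis is shorter than the reference. A short sketch that recomputes the score from the reported statistics; the function name is hypothetical:

```python
import math


def bleu_from_stats(precisions, hyp_len, ref_len):
    """Recombine n-gram precisions and lengths into (BLEU, brevity penalty, ratio)."""
    ratio = hyp_len / float(ref_len)
    # The penalty only applies when the hypothesis is shorter than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / float(hyp_len))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return bp * geo_mean, bp, ratio


bleu, bp, ratio = bleu_from_stats(
    [0.82951, 0.757936, 0.71831, 0.68288], hyp_len=5449, ref_len=6952)
print("BLEU = {:.5f} (BP={:.6f}, ratio={:.6f})".format(bleu, bp, ratio))
```

In this example the geometric mean of the precisions is about 0.745, so roughly a quarter of the score is lost to the brevity penalty alone, which is exactly the short-hypothesis case described above.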

Ability to Check if Decoding Matches Loss Calculation

It would be nice if we had a testing setup that allowed us to check if the score calculated during decoding matched the score by calc_loss. This would greatly help with debugging one of the most common errors when implementing models (train-test differences).

Feature Request: Unknown Word Replacement

It would be nice to have options for unknown word replacement, either with the original word or using a lexicon.
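
A common way to do this is to replace each unknown token with the source word that received the highest attention weight, optionally translated through a lexicon. A minimal sketch under that assumption; the function, the "<unk>" symbol, and the attention layout are all hypothetical here:

```python
def replace_unknowns(hyp_tokens, src_tokens, attention, lexicon=None, unk="<unk>"):
    """Replace unk tokens using attention; attention[j][i] is the weight on
    source position i when target token j was generated."""
    lexicon = lexicon or {}
    output = []
    for j, token in enumerate(hyp_tokens):
        if token == unk:
            weights = attention[j]
            # Pick the most-attended source word for this target position.
            best_i = max(range(len(src_tokens)), key=lambda i: weights[i])
            src_word = src_tokens[best_i]
            # Use the lexicon entry if there is one, else copy the source word.
            token = lexicon.get(src_word, src_word)
        output.append(token)
    return output


src_tokens = ["watashi", "wa", "Tanaka", "desu"]
hyp_tokens = ["i", "am", "<unk>"]
attention = [[0.7, 0.2, 0.05, 0.05],
             [0.1, 0.1, 0.1, 0.7],
             [0.05, 0.05, 0.8, 0.1]]
print(replace_unknowns(hyp_tokens, src_tokens, attention))
print(replace_unknowns(hyp_tokens, src_tokens, attention, lexicon={"Tanaka": "TANAKA"}))
```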

Documentation link is wrong

The readme points to the badge instead of the documentation:

information can be found in the documentation.

It should point to http://xnmt.readthedocs.io/en/latest/.

Feature Request: More Verbose Error Messages on Malformed Config File

Currently when we get a poorly formed config file (missing a necessary argument, etc.), the message is quite difficult to understand. It'd be nice if it pointed out exactly what the problem was.

PolynomialNormalization missing attribute

c6fac72 seems to have introduced a bug with PolynomialNormalization; can be reproduced by running the standard.yaml config file in examples/.

Call:
python3 xnmt/xnmt_run_experiments.py examples/standard.yaml --dynet-gpu

Produces this output (truncated):

> Training
   Epoch 0.1002: train_loss/word=7.026294 (words=9426, words/sec=1150.65, time=0-00:00:08)
   Epoch 0.2000: train_loss/word=6.703589 (words=18864, words/sec=1155.63, time=0-00:00:16)
   Epoch 0.3003: train_loss/word=6.518545 (words=28110, words/sec=1188.03, time=0-00:00:24)
   Epoch 0.4005: train_loss/word=6.392840 (words=37588, words/sec=1138.11, time=0-00:00:32)
   Epoch 0.5003: train_loss/word=6.307682 (words=46648, words/sec=1159.72, time=0-00:00:40)
   Epoch 0.6004: train_loss/word=6.225853 (words=55790, words/sec=1144.99, time=0-00:00:48)
   Epoch 0.7000: train_loss/word=6.154869 (words=65143, words/sec=1154.20, time=0-00:00:56)
   Epoch 0.8003: train_loss/word=6.088884 (words=74466, words/sec=1143.59, time=0-00:01:04)
   Epoch 0.9000: train_loss/word=6.024372 (words=83640, words/sec=1188.11, time=0-00:01:12)
   Epoch 1.0000: train_loss/word=5.964533 (words=93086, words/sec=1154.51, time=0-00:01:20)
   Traceback (most recent call last):
     File "xnmt/xnmt_run_experiments.py", line 154, in <module>
          sys.exit(main())
     File "xnmt/xnmt_run_experiments.py", line 118, in main
          training_regimen.run_epochs(exp_args["run_for_epochs"])
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/train.py", line 190, in run_epochs
          self.one_epoch()
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/train.py", line 241, in one_epoch
          self.dev_evaluation()
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/train.py", line 260, in dev_evaluation
          xnmt.xnmt_decode.xnmt_decode(model_elements=(self.corpus_parser, self.model), **self.decode_args)
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/xnmt_decode.py", line 111, in xnmt_decode
          output = generator.generate_output(src, i, forced_trg_ids=ref_ids)
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/generator.py", line 6, in generate_output
          generation_output = self.generate(*args, **kwargs)
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/translator.py", line 127, in generate
          output_actions, score = self.search_strategy.generate_output(self.decoder, self.attender, self.trg_embedder, dec_state, src_length=len(sents), forced_trg_ids=forced_trg_ids)
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/search_strategy.py", line 97, in generate_output
          new_set.append(self.Hypothesis(self.len_norm.normalize_partial(hyp.score, score[cur_id], len(new_list)),
     File "/home/ziyux/installs/miniconda3/envs/dynet/lib/python3.6/site-packages/xnmt-0.0.1-py3.6.egg/xnmt/length_normalization.py", line 73, in normalize_partial
          return (score_so_far * pow(new_len-1, self.m) + score_to_add) / pow(new_len, self.m)
   AttributeError: 'PolynomialNormalization' object has no attribute 'm'
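
The AttributeError means that self.m was never assigned on the instance before normalize_partial read it. A minimal sketch of a class where the exponent is set in the constructor, reusing the formula from the last traceback frame; the class name and the default m=1.0 are assumptions, not xnmt's actual definition:

```python
class PolynomialNormalizationSketch(object):
    """Hypothetical stand-in for PolynomialNormalization."""

    def __init__(self, m=1.0):
        # Store the exponent on the instance so normalize_partial can read it.
        self.m = m

    def normalize_partial(self, score_so_far, score_to_add, new_len):
        # Undo the previous normalization, add the new score, renormalize.
        return (score_so_far * pow(new_len - 1, self.m) + score_to_add) / pow(new_len, self.m)


norm = PolynomialNormalizationSketch(m=1.0)
score = 0.0
for length, log_prob in enumerate([-1.0, -2.0, -3.0], start=1):
    score = norm.normalize_partial(score, log_prob, length)
print(score)
```

With m=1 the incremental update reproduces the plain per-word average of the log probabilities, which is a quick sanity check for the formula.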

Travis CI is failing

Travis CI checks are failing on the pip install of DyNet. I'm guessing that this is something simple like not installing mercurial or the compile environment beforehand, and might be resolved by adding packages to the .travis.yml file: http://dynet.readthedocs.io/en/latest/python.html

@philip30 if you have time today perhaps you could take a look? If not, I'll try to take a look later.

Feature Request: Ensembling

It would be nice to be able to decode with an ensemble of multiple models.
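
One simple form of ensembling averages the next-word distributions of the individual models at every decoding step. A minimal sketch of that averaging in probability space, assuming every model shares the same vocabulary and assigns non-zero probability to each word; the function name is hypothetical:

```python
import math


def ensemble_log_probs(model_log_probs):
    """Average per-model next-word distributions; return log probabilities.

    model_log_probs[k][w] is the log probability model k assigns to word w.
    """
    num_models = len(model_log_probs)
    vocab_size = len(model_log_probs[0])
    averaged = []
    for w in range(vocab_size):
        # Average in probability space, then go back to log space.
        prob = sum(math.exp(log_probs[w]) for log_probs in model_log_probs) / num_models
        averaged.append(math.log(prob))
    return averaged


model_a = [math.log(0.5), math.log(0.5)]
model_b = [math.log(0.9), math.log(0.1)]
combined = ensemble_log_probs([model_a, model_b])
print([round(math.exp(lp), 3) for lp in combined])
```

Averaging the log probabilities directly (a geometric mean) is the other common choice; which one works better would need to be checked empirically.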

Loss Calculator

Specifying the loss_calculator within the model seems like a good idea, but it binds the model to a specific training process, because we get the entire model specification from the pre-trained model. This prevents us from fine-tuning a model with a training process other than the one it was initially trained with.
Should this be made more flexible? Or am I missing something?

Terminology Confusing

Currently, some key terms are reused for different concepts:

model_globals.params (global hyperparams + dynet weights)
model_globals.params.model (dynet weights)
model in the YAML config (top of the model hierarchy, e.g. translator or retriever)
ModelParams: container for serialization, contains YAML model, corpus parser, global_params

Residual network serialization fails

When saving a model that contains a residual encoder (e.g. a ResidualLSTMEncoder), save_to_file fails with "Class LookupParameters is not serializable. Try adding serialize_params to it."

However it seems that the model_lookup field in ResidualLSTMEncoder (the one that's causing the issue) is never used anywhere in the code (since lookup can be performed directly from the embeddings field of the embedder). Just deleting that field makes model serialization work. I just wanted to confirm that my understanding was correct and that the field can safely be removed.

Implement Copy Mechanism

It would be nice to have this paper implemented:
https://arxiv.org/abs/1603.06393
