Comments (6)

philip30 commented on September 4, 2024

Hi, can you copy and paste your YAML file here? I can try to take a look.

from xnmt.

tenghaha commented on September 4, 2024

@philip30 I'm using the default configuration file provided with the article
http://www.msperber.com/research/self-att/index.html
but I changed the tag TransformerSeqTransducer to SAAMSeqTransducer, and ContVecReader to NpzReader:

# purely self-att model, using concatenated 'trigonometric' positional encodings
exp029a.1: !Experiment
  exp_global: !ExpGlobal
    dropout: 0.2
    default_layer_dim: 512
    placeholders:
      DATA_DIR: (removed)
  model: !DefaultTranslator
    src_embedder: !NoopEmbedder
      emb_dim: 40
    encoder: !ModularSeqTransducer
      modules:
      - !SAAMSeqTransducer
        layers: 2
        input_dim: 40
        hidden_dim: 512
        downsample_factor: 2
        ff_hidden_dim: 512
        pos_encoding_type: trigonometric
        pos_encoding_combine: concat
        pos_encoding_size: 40
        max_len: 1500
      - !SAAMSeqTransducer
        layers: 1
        input_dim: 512
        hidden_dim: 512
        downsample_factor: 1
        ff_hidden_dim: 512
    attender: !MlpAttender
      hidden_dim: 128
    trg_embedder: !SimpleWordEmbedder
      emb_dim: 64
      word_dropout: 0.1
      fix_norm: 1
    decoder: !MlpSoftmaxDecoder
      layers: 1
      lstm_dim: 512
      input_feeding: True
      bridge: !NoBridge {}
    src_reader: !NpzReader
      transpose: true
      feat_from: 0
      feat_to: 40
    trg_reader: !PlainTextReader
      vocab: !Vocab
        vocab_file: '{DATA_DIR}/train/src.char.vocab.manual'
  train: !SimpleTrainingRegimen
    src_file: '{DATA_DIR}/train/src.contvec.npz'
    trg_file: '{DATA_DIR}/train/src.char'
    max_src_len: &max_src_len 1500
    max_trg_len: 350
    run_for_epochs: 500
    batcher: !WordSrcBatcher
      avg_batch_size: 18
      pad_src_to_multiple: 4
      src_pad_token: ~
    trainer: !AdamTrainer
      alpha: 0.0003
    lr_decay: 0.5
    lr_decay_times: 3
    patience: 5
    initial_patience: 10
    dev_every: 0
    restart_trainer: True
    dev_tasks:
      - !AccuracyEvalTask
        eval_metrics: wer,cer
        src_file: &dev_src '{DATA_DIR}/dev/src.contvec.npz'
        ref_file: '{DATA_DIR}/dev/src.txt'
        hyp_file: '{EXP_DIR}/models/{EXP}.dev_hyp'
        inference: !SimpleInference
          post_process: join-char
          search_strategy: !BeamSearch
            beam_size: 20
            max_len: 500
            len_norm: !PolynomialNormalization
              apply_during_search: true
              m: 1.5
      - !LossEvalTask
        src_file: *dev_src
        ref_file: '{DATA_DIR}/dev/src.char'
  evaluate:
    - !AccuracyEvalTask
      eval_metrics: wer,cer
      src_file: &test_src '{DATA_DIR}/test/src.contvec.npz'
      ref_file: '{DATA_DIR}/test/src.txt'
      hyp_file: '{EXP_DIR}/models/{EXP}.test_hyp'
      inference: !SimpleInference
        post_process: join-char
        search_strategy: !BeamSearch
          beam_size: 20
          max_len: 500
          len_norm: !PolynomialNormalization
            apply_during_search: true
            m: 1.5
    - !LossEvalTask
      src_file: *test_src
      ref_file: '{DATA_DIR}/test/src.char'

philip30 commented on September 4, 2024

Hmm, this is very strange. Which revision of xnmt are you using? I will try to reproduce the issue.

tenghaha commented on September 4, 2024

XNMT: 0.1.0 release (revision 0d094f5)
(I just tried the newest revision, a87e7b9, and the error is the same)

Python: 3.6.4 (I also tried 3.6.8, same error)
OS: Ubuntu 16.04

tenghaha commented on September 4, 2024

@philip30 I think I found the key.
When I set the argument 'ff_lstm' to 'true', it works normally.
I'm trying to find out the reason.
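
For reference, here is a sketch of where that flag might go in the configuration posted above. It assumes 'ff_lstm' is an argument of SAAMSeqTransducer; the thread does not say which component takes it, or whether one or both encoder modules need it, so the placement below is a guess rather than a confirmed fix. Every other value is copied unchanged from the original config.

```yaml
# Fragment of the `model:` section only; the rest of the experiment is unchanged.
encoder: !ModularSeqTransducer
  modules:
  - !SAAMSeqTransducer
    layers: 2
    input_dim: 40
    hidden_dim: 512
    downsample_factor: 2
    ff_hidden_dim: 512
    ff_lstm: true              # assumed placement of the reported workaround
    pos_encoding_type: trigonometric
    pos_encoding_combine: concat
    pos_encoding_size: 40
    max_len: 1500
  - !SAAMSeqTransducer
    layers: 1
    input_dim: 512
    hidden_dim: 512
    downsample_factor: 1
    ff_hidden_dim: 512
    ff_lstm: true              # assumed; may not be needed on both modules
```

Setting the flag on one module at a time would show which of the two actually triggers the error.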

philip30 commented on September 4, 2024

Thanks! I am closing this issue.
