
neural-transducer's People

Contributors

jkodner05, pre-commit-ci[bot], shijie-wu, slowwavesleep


neural-transducer's Issues

encode() missing 1 required positional argument: 'src_mask' - Beam Decode

Hi, Thank you for making the code open-source!

I am trying to train a g2p-based model with beam decoding. Unfortunately, I am getting the following error; please refer to the logs below for complete details.

FYI, the code works fine with greedy decoding. Kindly advise.

(base) [aagarwal@ip-0A000427 neural-transducer]$ python src/train.py --train data/100hrs-youtube.train --dev data/100hrs-youtube.dev --test data/100hrs-youtube.test --epochs 100 --dataset g2p --arch transformer --model models/v2-beam-search-decoding/v2 --decode beam
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: seed - 0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: train - ['data/100hrs-youtube.train']
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: dev - ['data/100hrs-youtube.dev']
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: test - ['data/100hrs-youtube.test']
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: model - 'models/v2-beam-search-decoding/v2'
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: load - ''
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: bs - 20
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: epochs - 100
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: max_steps - 0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: warmup_steps - 4000
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: total_eval - -1
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: optimizer - <Optimizer.adam: 'adam'>
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: scheduler - <Scheduler.reducewhenstuck: 'reducewhenstuck'>
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: lr - 0.001
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: min_lr - 1e-05
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: momentum - 0.9
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: beta1 - 0.9
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: beta2 - 0.999
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: estop - 1e-08
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: cooldown - 0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: patience - 0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: discount_factor - 0.5
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: max_norm - 0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: gpuid - []
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: loglevel - 'info'
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: saveall - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: shuffle - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: cleanup_anyway - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: dataset - <Data.g2p: 'g2p'>
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: max_seq_len - 128
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: max_decode_len - 128
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: init - ''
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: dropout - 0.2
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: embed_dim - 100
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: nb_heads - 4
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: src_layer - 1
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: trg_layer - 1
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: src_hs - 200
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: trg_hs - 200
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: label_smooth - 0.0
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: tie_trg_embed - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: arch - <Arch.transformer: 'transformer'>
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: nb_sample - 2
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: wid_siz - 11
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: indtag - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: decode - <Decode.beam: 'beam'>
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: mono - False
INFO - 10/18/20 14:31:37 - 0:00:00 - command line argument: bestacc - False
INFO - 10/18/20 14:31:37 - 0:00:00 - src vocab size 45
INFO - 10/18/20 14:31:37 - 0:00:00 - trg vocab size 44
INFO - 10/18/20 14:31:37 - 0:00:00 - src vocab ['<PAD>', '<s>', '<\\s>', '<UNK>', '"b', '"g', '"h', '"i', '"j', '"k', '"m', '"n', '"s', '"z', "'", 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ß', 'ä', 'ö', 'ü']
INFO - 10/18/20 14:31:37 - 0:00:00 - trg vocab ['<PAD>', '<s>', '<\\s>', '<UNK>', "'", ',"', '-', '.', '\\', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '¨', 'ß', 'ä', 'ç', 'è', 'é', 'ö', 'ü', 'ș']
INFO - 10/18/20 14:31:37 - 0:00:00 - model: Transformer(
                                       (src_embed): Embedding(45, 100, padding_idx=0)
                                       (trg_embed): Embedding(44, 100, padding_idx=0)
                                       (position_embed): SinusoidalPositionalEmbedding()
                                       (encoder): TransformerEncoder(
                                         (layers): ModuleList(
                                           (0): TransformerEncoderLayer(
                                             (self_attn): MultiheadAttention(
                                               (out_proj): _LinearWithBias(in_features=100, out_features=100, bias=True)
                                             )
                                             (linear1): Linear(in_features=100, out_features=200, bias=True)
                                             (dropout): Dropout(p=0.2, inplace=False)
                                             (linear2): Linear(in_features=200, out_features=100, bias=True)
                                             (norm1): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                             (norm2): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                             (activation_dropout): Dropout(p=0.2, inplace=False)
                                           )
                                         )
                                         (norm): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                       )
                                       (decoder): TransformerDecoder(
                                         (layers): ModuleList(
                                           (0): TransformerDecoderLayer(
                                             (self_attn): MultiheadAttention(
                                               (out_proj): _LinearWithBias(in_features=100, out_features=100, bias=True)
                                             )
                                             (multihead_attn): MultiheadAttention(
                                               (out_proj): _LinearWithBias(in_features=100, out_features=100, bias=True)
                                             )
                                             (linear1): Linear(in_features=100, out_features=200, bias=True)
                                             (dropout): Dropout(p=0.2, inplace=False)
                                             (linear2): Linear(in_features=200, out_features=100, bias=True)
                                             (norm1): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                             (norm2): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                             (norm3): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                             (activation_dropout): Dropout(p=0.2, inplace=False)
                                           )
                                         )
                                         (norm): LayerNorm((100,), eps=1e-05, elementwise_affine=True)
                                       )
                                       (final_out): Linear(in_features=100, out_features=44, bias=True)
                                       (dropout): Dropout(p=0.2, inplace=False)
                                     )
INFO - 10/18/20 14:31:37 - 0:00:00 - number of parameter 216544
INFO - 10/18/20 14:31:37 - 0:00:00 - maximum training 269700 steps (100 epochs)
INFO - 10/18/20 14:31:37 - 0:00:00 - evaluate every 1 epochs
INFO - 10/18/20 14:31:37 - 0:00:00 - At 0-th epoch with lr 0.001000.
100%|| 2697/2697 [01:10<00:00, 38.40it/s]
INFO - 10/18/20 14:32:47 - 0:01:11 - Running average train loss is 1.5452647511058266 at epoch 0
INFO - 10/18/20 14:32:47 - 0:01:11 - At 1-th epoch with lr 0.001000.
100%|| 2697/2697 [01:06<00:00, 40.65it/s]
INFO - 10/18/20 14:33:54 - 0:02:17 - Running average train loss is 1.218658867061779 at epoch 1
100%|| 338/338 [00:02<00:00, 128.70it/s]
INFO - 10/18/20 14:33:56 - 0:02:19 - Average dev loss is 0.9772854196073035 at epoch 1
  0%|| 0/6741 [00:00<?, ?it/s]
Exception ignored in: <generator object StandardG2P.read_file at 0x2af3a8d8b3d0>
RuntimeError: generator ignored GeneratorExit
Traceback (most recent call last):
  File "src/train.py", line 350, in <module>
    main()
  File "src/train.py", line 346, in main
    trainer.run(start_epoch, decode_fn=decode_fn)
  File "/share/pretzel1/exp1/aagarwal/neural-transducer/src/trainer.py", line 373, in run
    eval_res = self.evaluate(DEV, epoch_idx, decode_fn)
  File "src/train.py", line 255, in evaluate
    decode_fn)
  File "/share/pretzel1/exp1/aagarwal/neural-transducer/src/util.py", line 194, in evaluate_all
    pred, _ = decode_fn(model, src)
  File "/share/pretzel1/exp1/aagarwal/neural-transducer/src/decoding.py", line 64, in __call__
    trg_eos=self.trg_eos)
  File "/share/pretzel1/exp1/aagarwal/neural-transducer/src/decoding.py", line 364, in decode_beam_search
    enc_hs = transducer.encode(src_sentence)
TypeError: encode() missing 1 required positional argument: 'src_mask'
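For reference, the mismatch can be reproduced in isolation. The sketch below uses hypothetical names (`TransformerLike`, `src`, `src_mask`) that merely mirror the two-argument `encode()` signature the traceback complains about; it is not the repo's actual code.

```python
class TransformerLike:
    # Hypothetical stand-in: encode() requires a padding mask as a second
    # positional argument, like the Transformer model in the traceback.
    def encode(self, src, src_mask):
        return [tok for tok, masked in zip(src, src_mask) if not masked]

model = TransformerLike()
src, src_mask = ["h", "i"], [False, False]

enc_hs = model.encode(src, src_mask)  # the greedy path passes both arguments

try:
    model.encode(src)  # the beam path passes only the source, as in the log
except TypeError as e:
    print(e)  # message ends with: missing 1 required positional argument: 'src_mask'
```

So the likely fix is for `decode_beam_search` to build and pass the source mask the same way the greedy decoding path does.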

Data set for g2p

Could you provide the dataset you used in the g2p experiments? I am wondering how you split the dictionary into training, dev, and test sets; this would be helpful for comparing the performance of different models.
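In the meantime, a common baseline is a seeded random split of the dictionary entries. This is only a sketch of that generic approach (the function name and the 80/10/10 ratios are illustrative), not the split actually used in the paper.

```python
import random

def split_dictionary(entries, seed=0, dev_frac=0.1, test_frac=0.1):
    # Shuffle deterministically, then carve off dev and test portions.
    rng = random.Random(seed)
    entries = list(entries)
    rng.shuffle(entries)
    n_dev = int(len(entries) * dev_frac)
    n_test = int(len(entries) * test_frac)
    dev = entries[:n_dev]
    test = entries[n_dev:n_dev + n_test]
    train = entries[n_dev + n_test:]
    return train, dev, test
```

Fixing the seed makes the split reproducible, which matters when comparing models trained on the same data.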

Strange results running the ACL2019 irregularity example code

hi! I've been playing around with the repo; the code is very nicely organized. but I have a question: I've run the code in example/irregularity-acl19 exactly as shown in the README, and I'm confused by the numbers I'm getting. I ran it on the English UniMorph data following the README, and also on German UniMorph, as I'm working with German right now.

according to the README, the output (i.e. in model/unimorph/large/monotag-hmm/{lang}-{fold}.decode.{split}.tsv) contains p(inflected form|lemma, tags) / len(inflected form). I assume this is in the loss column in the TSVs, as that's the only column that makes sense.

here's the distribution of values I get in that column. N is the number of predicted forms overall, across all folds and dev/test splits, and N(p > 1) is the number of predicted forms where the listed value for p(inflected form|lemma, tags) / len(inflected form) is greater than 1. the results are split by whether the model correctly predicted the target form.

Lang  Prediction correct?  N       N(p > 1)  mean(p)  min(p)  max(p)
ENG   Yes                   95861       0    0.0023   1e-7     0.2833
ENG   No                     5437    1524    0.9874   5e-2    22.3135
DEU   Yes                  318311       0    0.0027   1e-7     0.2891
DEU   No                    28423    7770    0.8299   5e-2    23.2277

the main thing that confuses me is that the model systematically assigns higher probabilities to forms it gets wrong. (it also looks like there might be a bug somewhere, given that roughly 28% of the incorrectly predicted forms in each language are assigned a probability greater than one.)

going by the paper, the degree of irregularity metric i should be calculated as -log(p / (1 - p)). applying that to the results above, the average i for words the model got right is 9.6 (ENG) and 11.0 (DEU), while for words it predicted wrong (excluding forms with p > 1, where i is undefined), the average i is 0.7 (ENG, DEU).

this seems completely at odds with the analysis described in the paper. I'm wondering if I've misunderstood something or ran the example wrong? any ideas what's going on here?
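For concreteness, the metric as stated in the issue can be computed directly. The helper name below is mine; note the formula is only defined for 0 < p < 1, which is why the p > 1 forms had to be excluded. (This applies i to the mean p, not averaging i per form, so the values differ from the per-form averages quoted above.)

```python
import math

def irregularity(p):
    # Degree of irregularity: i = -log(p / (1 - p)).
    # Undefined for p <= 0 or p >= 1, hence the exclusion of p > 1 forms.
    return -math.log(p / (1.0 - p))

print(round(irregularity(0.0023), 1))  # mean(p), correct ENG forms -> 6.1
print(round(irregularity(0.9874), 1))  # mean(p), incorrect ENG forms -> -4.4
```

As expected under the metric, low-probability (regular-looking score) forms get large positive i and high-probability forms get negative i, which makes the reported pattern (wrong forms scoring higher p) all the more surprising.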

Writing out final dev/test predictions to file?

Hi,

I want to analyse the actual predictions of a trained model (i.e., the word forms themselves), so I need to save them to a file somehow. Is there an easy way to do that? Sorry if it's something obvious, but I can't find it at a glance.
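In the absence of a built-in flag, one workaround is to dump the decoded forms wherever the evaluation loop has them in hand. The helper below is a generic sketch (the function name and TSV columns are mine, not part of the repo's API):

```python
def write_predictions(path, examples):
    # examples: iterable of (source, gold, prediction) token sequences
    with open(path, "w", encoding="utf-8") as f:
        f.write("source\tgold\tprediction\n")
        for src, gold, pred in examples:
            f.write("{}\t{}\t{}\n".format(
                " ".join(src), " ".join(gold), " ".join(pred)))

# hypothetical usage with one g2p example:
write_predictions("dev.preds.tsv", [(["h", "i"], ["h", "ay"], ["h", "ay"])])
```

A TSV with one row per example makes it easy to grep for mismatches or load the predictions into pandas for error analysis.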

Error for creating conda env

Hi Shijie,

I got the following error:

C:\research\neural-transducer-master>conda env create --file environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - zlib==1.2.11=h516909a_1010
  - mkl_random==1.2.1=py39ha9443f7_2
  - mkl-service==2.3.0=py39h27cfd23_1
  - virtualenv==20.4.4=py39hf3d152e_0
  - ld_impl_linux-64==2.35.1=hea4e1c9_2
  - cffi==1.14.5=py39he32792d_0
  - libstdcxx-ng==9.3.0=h6de172a_19
  - readline==8.1=h46c0cb4_0
  - _libgcc_mutex==0.1=conda_forge
  - libgcc-ng==9.3.0=h2828fa1_19
  - libuv==1.41.0=h7f98852_0
  - tbb==2021.2.0=h4bd325d_0
  - python==3.9.2=hffdb5ce_0_cpython
  - pre-commit==2.12.1=py39hf3d152e_0
  - pytorch==1.8.1=py3.9_cuda11.1_cudnn8.0.5_0
  - ncurses==6.2=h58526e2_4
  - mkl==2021.2.0=h726a3e6_389
  - ninja==1.10.2=h4bd325d_0
  - _openmp_mutex==4.5=1_llvm
  - ca-certificates==2020.12.5=ha878542_0
  - numpy-base==1.20.1=py39h7d8b39e_0
  - pyyaml==5.4.1=py39h3811e60_0
  - cudatoolkit==11.1.1=h6406543_8
  - mkl_fft==1.3.0=py39h42c9631_2
  - yaml==0.2.5=h516909a_0
  - llvm-openmp==11.1.0=h4bd325d_1
  - sqlite==3.35.5=h74cdb3f_0
  - libffi==3.3=h58526e2_2
  - editdistance-s==1.0.0=py39h1a9c180_1
  - tk==8.6.10=h21135ba_1
  - jedi==0.18.0=py39hf3d152e_2
  - xz==5.2.5=h516909a_1
  - certifi==2020.12.5=py39hf3d152e_1
  - numpy==1.20.1=py39h93e21f0_0
  - setuptools==49.6.0=py39hf3d152e_3
  - ipython==7.22.0=py39hef51801_0
  - openssl==1.1.1k=h7f98852_0

It would be really helpful if you could give me some hints to solve this issue. :)
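The build strings in those pins (e.g. h516909a_1010) name linux-64 builds, which conda cannot resolve on Windows (note the C:\research path). A common workaround, sketched here (the regex and any file names are mine, not from the repo), is to strip the build strings from environment.yml so conda picks platform-appropriate builds:

```python
import re

def strip_build_strings(yml_text):
    # Turn '  - zlib==1.2.11=h516909a_1010' into '  - zlib==1.2.11'
    # so conda resolves a build for the current platform instead of linux-64.
    return re.sub(r"(==[^=\s]+)=\S+", r"\1", yml_text)

print(strip_build_strings("  - zlib==1.2.11=h516909a_1010"))
# ->   - zlib==1.2.11
```

Run this over environment.yml, save the result as a new file, and retry `conda env create --file` on it. Some pins (e.g. cudatoolkit) may still fail on Windows if no build exists for that platform at that version, in which case those entries need to be relaxed or removed.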
