
Comments (5)

trenous avatar trenous commented on July 30, 2024

Hello Zachary, I was not able to reproduce your bug. Can you verify that you are using the newest version of the repository? If you already are, or if updating does not solve your problem, could you provide the trained model so we can analyze what is going wrong?

Best

from openkiwi.

Zachary-YL avatar Zachary-YL commented on July 30, 2024

Thank you for your reply.
Here is the trained Estimator model.

https://www.dropbox.com/s/ce9akwcvhs4tcbo/best_model.torch?dl=0

Here are my config files for training the Estimator model and for predicting:

train_estimator_yaml.txt
predict_estimator_yaml.txt

It's worth noting that I trained the Estimator model on a CPU, because training on a GPU produces the following error:

2019-04-24 04:03:26.789 [root setup:380] This is run ID: 62e6dc469e3a4971bbce19bc119487c5
2019-04-24 04:03:26.790 [root setup:383] Inside experiment ID: 0 (None)
2019-04-24 04:03:26.790 [root setup:386] Local output directory is: runs/0/62e6dc469e3a4971bbce19bc119487c5
2019-04-24 04:03:26.790 [root setup:389] Logging execution to MLflow at: None
2019-04-24 04:03:26.872 [root setup:395] Using GPU: 0
2019-04-24 04:03:26.873 [root setup:400] Artifacts location: None
2019-04-24 04:03:26.886 [kiwi.lib.train run:154] Training the PredEst (Predictor-Estimator) model
2019-04-24 04:03:27.666 [kiwi.data.utils load_vocabularies_to_fields:126] Loaded vocabularies from runs/predictor/best_model.torch
2019-04-24 04:03:38.657 [kiwi.lib.train run:187] Estimator(
  (predictor_tgt): Predictor(
    (attention): Attention(
      (scorer): MLPScorer(
        (layers): ModuleList(
          (0): Sequential(
            (0): Linear(in_features=1600, out_features=800, bias=True)
            (1): Tanh()
          )
          (1): Sequential(
            (0): Linear(in_features=800, out_features=1, bias=True)
            (1): Tanh()
          )
        )
      )
    )
    (embedding_source): Embedding(9300, 200, padding_idx=1)
    (embedding_target): Embedding(3845, 200, padding_idx=1)
    (lstm_source): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5, bidirectional=True)
    (forward_target): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5)
    (backward_target): LSTM(200, 400, num_layers=2, batch_first=True, dropout=0.5)
    (W1): Embedding(3845, 200, padding_idx=1)
    (_loss): CrossEntropyLoss()
  )
  (mlp): Sequential(
    (0): Linear(in_features=1000, out_features=125, bias=True)
    (1): Tanh()
  )
  (lstm): LSTM(125, 125, batch_first=True, bidirectional=True)
  (embedding_out): Linear(in_features=250, out_features=2, bias=True)
  (sentence_pred): Sequential(
    (0): Linear(in_features=250, out_features=125, bias=True)
    (1): Sigmoid()
    (2): Linear(in_features=125, out_features=62, bias=True)
    (3): Sigmoid()
    (4): Linear(in_features=62, out_features=1, bias=True)
  )
  (xents): ModuleDict(
    (tags): CrossEntropyLoss()
  )
  (mse_loss): MSELoss()
)
2019-04-24 04:03:38.658 [kiwi.lib.train run:188] 16202078 parameters
2019-04-24 04:03:38.670 [kiwi.trainers.trainer run:74] Epoch 1 of 10
Batches: 0%| | 1/232 [00:02<09:19, 2.42s/ batches]
Traceback (most recent call last):
  File "estimator_train_sl.py", line 4, in <module>
    kiwi.train(estimator_config)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 79, in train_from_file
    return train_from_options(options)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 123, in train_from_options
    trainer = run(ModelClass, output_dir, pipeline_options, model_options)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/lib/train.py", line 204, in run
    trainer.run(train_iter, valid_iter, epochs=pipeline_options.epochs)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 75, in run
    self.train_epoch(train_iterator, valid_iterator)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 95, in train_epoch
    outputs = self.train_step(batch)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/trainers/trainer.py", line 139, in train_step
    model_out = self.model(batch)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor_estimator.py", line 324, in forward
    model_out_tgt = self.predictor_tgt(batch)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor.py", line 275, in forward
    for i in range(target_len - 2)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/models/predictor.py", line 275, in <listcomp>
    for i in range(target_len - 2)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/models/modules/attention.py", line 36, in forward
    scores = self.scorer(query, keys)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/code/OpenKiwi-master/kiwi/models/modules/scorer.py", line 60, in forward
    layer_in = layer(layer_in)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home2/zyl/anaconda3/envs/openkiwi/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 292, in forward
    return torch.tanh(input)
RuntimeError: CUDA out of memory. Tried to allocate 57.62 MiB (GPU 0; 10.92 GiB total capacity; 6.78 GiB already allocated; 31.50 MiB free; 109.37 MiB cached)

Thanks a lot!


captainvera avatar captainvera commented on July 30, 2024

Hi @Zachary-YL we will look into what is happening on the predict pipeline.

Meanwhile, the error you're getting when training on a GPU just means that OpenKiwi is trying to allocate more memory than is available on your GPU. This happens when the combination of the batch size and the number of tokens per sentence is too large.
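As a rough illustration of why this happens (a back-of-the-envelope sketch, not OpenKiwi's actual memory accounting; the function and the numbers below are hypothetical), the activation memory of a recurrent layer grows linearly with both batch size and sentence length, so halving either roughly halves that term:

```python
# Rough estimate of activation memory for one bidirectional LSTM layer.
# Real usage is larger: PyTorch also stores gradients, optimizer state,
# and intermediate buffers for the backward pass.
def activation_bytes(batch_size, seq_len, hidden_size,
                     bytes_per_float=4, directions=2):
    # One float per hidden unit, per direction, per token, per sentence.
    return batch_size * seq_len * hidden_size * directions * bytes_per_float

full = activation_bytes(batch_size=64, seq_len=100, hidden_size=400)
half = activation_bytes(batch_size=32, seq_len=100, hidden_size=400)
print(full // (1024 * 1024), "MiB vs", half // (1024 * 1024), "MiB")
```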

You can easily train using the GPU if you do one of two things (or both):

  • Reduce the batch size
  • Set the source-max-length and target-max-length flags in the training YAML
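For instance, the training YAML could be adjusted like this (a sketch only: the batch-size key name and all values are assumptions that should be tuned to your GPU; source-max-length and target-max-length are the flags mentioned above):

```yaml
# Hypothetical excerpt of train_estimator.yaml -- values are illustrative.
train-batch-size: 16     # smaller batches allocate less GPU memory per step
valid-batch-size: 16
source-max-length: 50    # cap source sentence length in tokens
target-max-length: 50    # cap target sentence length in tokens
```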


trenous avatar trenous commented on July 30, 2024

Hello Zachary,

Sorry for the long delay in response, our team was busy with the WMT shared task.

I have run your predict YAML with the model you provided (changing source and target to a toy file) and it worked fine without error.
Are you sure it is not a version issue? The first release of OpenKiwi broke when training for sentence level only.


trenous avatar trenous commented on July 30, 2024

I am closing this as it seems to be solved.

