Comments (3)
Hi, after MLE training we picked the best checkpoint (0050000.tar) as the starting point for RL training. Although the ROUGE-L score improved, the ROUGE-1 and ROUGE-2 scores became much worse.
The evaluation results are shown below (we use ROUGE-1 for evaluation).
mle (official testset):
Training mle: yes, Training rl: no, mle weight: 1.00, rl weight: 0.00
intra_encoder: True intra_decoder: True
0005000.tar rouge_1: 0.3174
0010000.tar rouge_1: 0.3249
0015000.tar rouge_1: 0.3289
0020000.tar rouge_1: 0.3325
0025000.tar rouge_1: 0.3331
0030000.tar rouge_1: 0.3357
0035000.tar rouge_1: 0.3379
0040000.tar rouge_1: 0.3355
0045000.tar rouge_1: 0.3382
0050000.tar rouge_1: 0.3426
0055000.tar rouge_1: 0.3384
0060000.tar rouge_1: 0.3339
0065000.tar rouge_1: 0.3410
0070000.tar rouge_1: 0.3408
0075000.tar rouge_1: 0.3425
0080000.tar rouge_1: 0.3384
0085000.tar rouge_1: 0.3362
0090000.tar rouge_1: 0.3424
0095000.tar rouge_1: 0.3377
0100000.tar rouge_1: 0.3361
0105000.tar rouge_1: 0.3357
0110000.tar rouge_1: 0.3389
0115000.tar rouge_1: 0.3374
0120000.tar rouge_1: 0.3341
0125000.tar rouge_1: 0.3357
0130000.tar rouge_1: 0.3377
0135000.tar rouge_1: 0.3317
0140000.tar rouge_1: 0.3321
0145000.tar rouge_1: 0.3349
0150000.tar rouge_1: 0.3363
rl (official testset):
in_rl=yes --mle_weight=0.0 --load_model=0050000.tar --new_lr=0.0001
Training mle: no, Training rl: yes, mle weight: 0.00, rl weight: 1.00
intra_encoder: True intra_decoder: True
Loaded model at data/saved_models/0050000.tar
0050000.tar rouge_1: 0.3426
0055000.tar rouge_1: 0.2522
0060000.tar rouge_1: 0.2520
0065000.tar rouge_1: 0.2549
0070000.tar rouge_1: 0.2550
0075000.tar rouge_1: 0.2547
0080000.tar rouge_1: 0.2584
0085000.tar rouge_1: 0.2576
0090000.tar rouge_1: 0.2543
0095000.tar rouge_1: 0.2567
0100000.tar rouge_1: 0.2562
0105000.tar rouge_1: 0.2556
0110000.tar rouge_1: 0.2547
0115000.tar rouge_1: 0.2575
0120000.tar rouge_1: 0.2543
0125000.tar rouge_1: 0.2581
0130000.tar rouge_1: 0.2534
0135000.tar rouge_1: 0.2533
0140000.tar rouge_1: 0.2526
0145000.tar rouge_1: 0.2511
0150000.tar rouge_1: 0.2547
mle result:
0075000.tar scores: {'rouge-1': {'f': 0.3424728366572667, 'p': 0.39166721241721236, 'r': 0.31968494072078807}, 'rouge-2': {'f': 0.1732520206640223, 'p': 0.19845553983053968, 'r': 0.1623725413112666}, 'rouge-l': {'f': 0.32962985739519235, 'p': 0.3758193750693756, 'r': 0.3075168451832533}}
rl result:
0080000.tar scores: {'rouge-1': {'f': 0.2574669041724543, 'p': 0.21302155489848726, 'r': 0.34803503077209935}, 'rouge-2': {'f': 0.11896310475645827, 'p': 0.09758671687502977, 'r': 0.16587082443700088}, 'rouge-l': {'f': 0.35379459020991105, 'p': 0.39799812070645335, 'r': 0.33855028225319733}}
0125000.tar scores: {'rouge-1': {'f': 0.25674349158898563, 'p': 0.21440196978373974, 'r': 0.34277860517537473}, 'rouge-2': {'f': 0.11907341598225046, 'p': 0.09900864566338015, 'r': 0.16306397570581008}, 'rouge-l': {'f': 0.35462601354567735, 'p': 0.40579230645897313, 'r': 0.33368591052575747}}
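To make the change easier to see, the posted scores can be put side by side. Below is a minimal sketch that uses only the numbers reported above, rounded to four decimals. The names `mle_0075000`, `rl_0080000` and `score_deltas` are illustrative and not part of the repository.

```python
# Scores copied from the results posted above, rounded to 4 decimals.
mle_0075000 = {
    "rouge-1": {"f": 0.3425, "p": 0.3917, "r": 0.3197},
    "rouge-2": {"f": 0.1733, "p": 0.1985, "r": 0.1624},
    "rouge-l": {"f": 0.3296, "p": 0.3758, "r": 0.3075},
}
rl_0080000 = {
    "rouge-1": {"f": 0.2575, "p": 0.2130, "r": 0.3480},
    "rouge-2": {"f": 0.1190, "p": 0.0976, "r": 0.1659},
    "rouge-l": {"f": 0.3538, "p": 0.3980, "r": 0.3386},
}


def score_deltas(before, after):
    """Return (after - before) for every metric and every f/p/r entry."""
    return {
        metric: {
            key: round(after[metric][key] - before[metric][key], 4)
            for key in before[metric]
        }
        for metric in before
    }


deltas = score_deltas(mle_0075000, rl_0080000)
for metric, values in deltas.items():
    print(metric, {key: f"{value:+.4f}" for key, value in values.items()})
```

With these numbers, ROUGE-1 and ROUGE-2 precision fall sharply while their recall rises slightly, and all three ROUGE-L values go up. One possible reading, which the thread does not confirm, is that the RL checkpoints produce longer summaries.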
Thanks for your help!
@pengzhi123 Could you please tell me the system specification you used? I am trying to run this on a Windows machine with 32 GB of RAM, and CUDA is not enabled on my system.
I suspect this code won't run properly in a Windows environment. Please advise.
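When sharing a system specification in a thread like this, a short script can collect the basics. This is a minimal sketch using only the Python standard library. The helper name `describe_environment` is illustrative, and finding `nvidia-smi` on the PATH is only a hint that an NVIDIA driver is installed, not proof that CUDA works with PyTorch.

```python
import platform
import shutil


def describe_environment():
    """Collect a few facts that are useful when reporting a setup problem."""
    return {
        "os": platform.system(),  # e.g. "Windows" or "Linux"
        "os_release": platform.release(),
        "python": platform.python_version(),
        # Assumption: nvidia-smi on PATH suggests an NVIDIA driver is present.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```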
from text-summarizer-pytorch.
You should use Ubuntu, not Windows.
Hello, I got a "StopIteration" error when running eval.py. Have you ever met the same problem? If you have, please tell me how to solve it.
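The thread does not show where eval.py raises the error, so the following is only a generic sketch of how `StopIteration` arises in Python and one way to guard against it. The `batches` and `drain` helpers are hypothetical and are not taken from the repository.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def drain(iterator):
    """Collect every batch without letting StopIteration escape."""
    collected = []
    while True:
        # next() with a default returns None at exhaustion instead of raising.
        batch = next(iterator, None)
        if batch is None:
            break
        collected.append(batch)
    return collected


result = drain(batches([1, 2, 3, 4, 5], 2))
print(result)

# A bare next() on an exhausted iterator is what raises StopIteration.
exhausted = batches([], 2)
try:
    next(exhausted)
    raised = False
except StopIteration:
    raised = True
print("bare next() raised StopIteration:", raised)
```

If the error comes from an evaluation loop that calls `next()` on a data iterator, checking that the evaluation data path exists and actually contains examples is a reasonable first step.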