
How to get the 78.6 F1 score? about drqa HOT 8 CLOSED

hitvoice commented on August 21, 2024

How to get the 78.6 F1 score?

from drqa.

Comments (8)

hitvoice commented on August 21, 2024

@taolei87 Hi, I finally completed the improvements. Now the code in this project is supposed to reach the scores reported in the original paper and the preprocessing code runs much faster. Check this out~


hitvoice commented on August 21, 2024

I set the best options as the default ones, as far as I can remember. I will run the experiment again and check this now. Thanks for your feedback :)


hitvoice commented on August 21, 2024

Whoops, I think I've found the logs of the experiment that produced the reported result.

TL;DR: decay the learning rate by a factor of 0.2 after the 20th epoch, and you'll get the best result at epoch 22!

command:

python train.py -e 20 -bs 32
python train.py -e 5 -rs checkpoint_epoch_20.pt -ro -rlr 0.2
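As a rough illustration of what the decay step does, here is a minimal sketch. It assumes that `-rlr 0.2` multiplies the resumed learning rate by 0.2 (the "learning rate reduced by 0.2" log line is consistent with that but does not confirm it). The helper `decay_lr` and the starting rate are hypothetical; the list of dicts only mirrors the shape of a PyTorch optimizer's `param_groups`.

```python
def decay_lr(param_groups, factor):
    """Multiply each group's learning rate by `factor`, in place."""
    for group in param_groups:
        group["lr"] *= factor
    return param_groups


# Hypothetical starting rate; the thread does not state the actual value.
groups = [{"lr": 0.002}]
decay_lr(groups, 0.2)
print(groups[0]["lr"])  # one fifth of the starting rate
```

With a real optimizer the same loop would run over `optimizer.param_groups` right after the checkpoint is loaded.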

The logs:

06/28/2017 03:01:24 Epoch 1
06/28/2017 03:14:55 dev EM: 56.95364238410596 F1: 67.8683825023103
06/28/2017 03:14:55 Epoch 2
06/28/2017 03:28:25 dev EM: 62.13812677388837 F1: 72.04167861231319
06/28/2017 03:28:30 Epoch 3
06/28/2017 03:41:58 dev EM: 63.614001892147584 F1: 73.24516372886401
06/28/2017 03:42:03 Epoch 4
06/28/2017 03:55:33 dev EM: 65.08987701040681 F1: 74.93402003131555
06/28/2017 03:55:38 Epoch 5
06/28/2017 04:09:08 dev EM: 66.30085146641439 F1: 76.29230596521498
06/28/2017 04:09:13 Epoch 6
06/28/2017 04:22:42 dev EM: 66.88741721854305 F1: 76.77407724544786
06/28/2017 04:22:47 Epoch 7
06/28/2017 04:36:16 dev EM: 67.78618732261117 F1: 77.4174293709857
06/28/2017 04:36:22 Epoch 8
06/28/2017 04:49:51 dev EM: 68.39167455061495 F1: 77.72788370820382
06/28/2017 04:49:57 Epoch 9
06/28/2017 05:03:26 dev EM: 68.61873226111636 F1: 77.96250138518607
06/28/2017 05:03:31 Epoch 10
06/28/2017 05:16:59 dev EM: 68.47682119205298 F1: 77.75544906108188
06/28/2017 05:17:00 Epoch 11
06/28/2017 05:30:28 dev EM: 68.39167455061495 F1: 77.89913272628915
06/28/2017 05:30:29 Epoch 12
06/28/2017 05:43:58 dev EM: 68.92147587511826 F1: 78.23847042319981
06/28/2017 05:44:04 Epoch 13
06/28/2017 05:57:32 dev EM: 68.46736045411542 F1: 77.9351043550382
06/28/2017 05:57:32 Epoch 14
06/28/2017 06:11:01 dev EM: 68.10785241248817 F1: 77.50540463923846
06/28/2017 06:11:02 Epoch 15
06/28/2017 06:24:31 dev EM: 68.61873226111636 F1: 78.14258166153962
06/28/2017 06:24:31 Epoch 16
06/28/2017 06:38:00 dev EM: 68.83632923368023 F1: 78.13731036452529
06/28/2017 06:38:00 Epoch 17
06/28/2017 06:51:29 dev EM: 68.53358561967833 F1: 78.11900607218556
06/28/2017 06:51:30 Epoch 18
06/28/2017 07:05:24 dev EM: 68.16461684011352 F1: 77.89089688627213
06/28/2017 07:05:25 Epoch 19
06/28/2017 07:19:03 dev EM: 69.12961210974456 F1: 78.35958772815894
06/28/2017 07:19:08 Epoch 20
06/28/2017 07:32:36 dev EM: 69.17691579943235 F1: 78.30739579214192
06/28/2017 08:04:05 Epoch 21
06/28/2017 08:17:37 dev EM: 68.72280037842951 F1: 77.99082545571326
06/28/2017 08:17:38 Epoch 22
06/28/2017 08:31:10 dev EM: 68.71333964049195 F1: 78.20651022825174
06/28/2017 08:31:10 Epoch 23
06/28/2017 08:41:22 [loading previous model...]
06/28/2017 08:41:27 [learning rate reduced by 0.2]
06/28/2017 08:42:05 Epoch 21
06/28/2017 08:55:40 dev EM: 69.2620624408704 F1: 78.44813635532361
06/28/2017 08:55:49 Epoch 22
06/28/2017 09:09:22 dev EM: 69.15799432355723 F1: 78.57088272408215
06/28/2017 09:09:32 Epoch 23
06/28/2017 09:23:04 dev EM: 68.93093661305582 F1: 78.25261296378513
06/28/2017 09:23:05 Epoch 24
06/28/2017 09:36:39 dev EM: 68.92147587511826 F1: 78.36606035049084
06/28/2017 09:36:40 Epoch 25
06/28/2017 09:50:12 dev EM: 69.10122989593188 F1: 78.39157962955184
06/28/2017 09:50:13 Epoch 26
06/28/2017 10:03:46 dev EM: 68.9593188268685 F1: 78.32544800754403
06/28/2017 10:03:46 Epoch 27
06/28/2017 10:17:19 dev EM: 68.76064333017976 F1: 78.13407753154141
06/28/2017 10:17:20 Epoch 28
06/28/2017 10:30:53 dev EM: 68.93093661305582 F1: 78.3647595641627
06/28/2017 10:30:54 Epoch 29
06/28/2017 10:44:27 dev EM: 68.86471144749291 F1: 78.25744308022497
06/28/2017 10:44:27 Epoch 30
06/28/2017 10:57:58 dev EM: 69.04446546830653 F1: 78.34886954385462

I think this'll be the right way (as far as I can remember) and I'll test this now. If it's verified I'll update the README.
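To pick the best dev F1 out of a log like the one above without reading it by eye, something like the following sketch could work. It assumes the exact line shapes shown in the log ("Epoch N" and "dev EM: ... F1: ..."); the function name `best_f1` is made up for illustration and is not part of the project.

```python
import re

EPOCH_RE = re.compile(r"Epoch (\d+)\s*$")
SCORE_RE = re.compile(r"dev EM: ([\d.]+) F1: ([\d.]+)")


def best_f1(lines):
    """Return (epoch, em, f1) for the highest dev F1 found, or None."""
    best = None
    epoch = None
    for line in lines:
        m = EPOCH_RE.search(line)
        if m:
            epoch = int(m.group(1))  # remember which epoch the next score belongs to
            continue
        m = SCORE_RE.search(line)
        if m and epoch is not None:
            em, f1 = float(m.group(1)), float(m.group(2))
            if best is None or f1 > best[2]:
                best = (epoch, em, f1)
    return best


# A few lines copied from the log above (after the learning rate was reduced).
sample = [
    "06/28/2017 07:19:08 Epoch 20",
    "06/28/2017 07:32:36 dev EM: 69.17691579943235 F1: 78.30739579214192",
    "06/28/2017 08:42:05 Epoch 21",
    "06/28/2017 08:55:40 dev EM: 69.2620624408704 F1: 78.44813635532361",
    "06/28/2017 08:55:49 Epoch 22",
    "06/28/2017 09:09:22 dev EM: 69.15799432355723 F1: 78.57088272408215",
    "06/28/2017 09:09:32 Epoch 23",
    "06/28/2017 09:23:04 dev EM: 68.93093661305582 F1: 78.25261296378513",
]
print(best_f1(sample))
```

Because the log restarts at Epoch 21 after the checkpoint is reloaded, epoch numbers can repeat; the sketch reports the epoch label attached to the best score and does not distinguish the two runs.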


taolei87 commented on August 21, 2024

hi @hitvoice ,

Thanks for the quick reply. Please let me know if I need to change any options.

Best,
Tao


hitvoice commented on August 21, 2024

With the same learning rate schedule, I could only get EM 69.23, F1 78.52 after 20+6 epochs with random seed 614 (this should be reproducible). The log file I posted above is real, but I can't work out which parameter settings or random seed I used to produce that result...
I should have kept the detailed settings of each experiment, and I'm sorry for the confusion. Although 0.1% is not a big gap, I'll try to figure it out anyway when I get some time.
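For context on why a fixed seed such as 614 makes a run repeatable, here is a minimal standard-library sketch. It only covers Python's own `random` module; a real training run would also need the NumPy and PyTorch generators seeded, which is not shown here. The helper `seeded_draws` is hypothetical.

```python
import random


def seeded_draws(seed, n=3):
    """Draw n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# The same seed yields the same sequence, so the draws compare equal.
print(seeded_draws(614) == seeded_draws(614))
```

Logging the seed next to the other options for each run is one way to avoid losing the settings behind a result.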


taolei87 commented on August 21, 2024

@hitvoice thanks for the help!


taolei87 commented on August 21, 2024

hi @hitvoice, just wanna say thanks for sharing your implementation. It saved me a lot of time.

I just published a draft on arxiv and I used your code in the SQuAD experiment :)
https://arxiv.org/pdf/1709.02755.pdf
https://github.com/taolei87/sru/tree/master/DrQA

Best,
Tao


hitvoice commented on August 21, 2024

WOW today I've seen the discussions of your paper many times in my social network feeds and newsletters! I just read your paper and it's really amazing! Congratulations on your great progress and thanks for citing this project ❤️

