
nikitakit commented on June 3, 2024

This is a known limitation with how the parser is currently formulated: each tree must contain a non-terminal node at the root. In your example above, the "part-of-speech" tag PU is the root node, which breaks that assumption.

The way I usually work around this is by wrapping all trees with a "TOP" symbol for the root node, and then disabling the "strip_top" flag when the treebank is known to contain examples that don't have a root non-terminal label.
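The wrapping workaround above can be sketched as a tiny preprocessing step over bracketed tree strings (a minimal sketch; `wrap_top` is a hypothetical helper name, not part of the repo):

```python
def wrap_top(tree_str, root_label="TOP"):
    """Wrap a bracketed tree string in a root non-terminal if it lacks one.

    `root_label` defaults to "TOP", matching the workaround described above.
    """
    tree_str = tree_str.strip()
    # Leave trees alone if they already carry the desired root label.
    if tree_str.startswith(f"({root_label} ") or tree_str.startswith(f"({root_label}("):
        return tree_str
    return f"({root_label} {tree_str})"
```

For example, `wrap_top("(PU 。)")` yields `(TOP (PU 。))`, which gives the parser the root non-terminal it expects; remember to disable `strip_top` accordingly.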

from self-attentive-parser.

nikitakit commented on June 3, 2024

It looks like you're using the English ELMo embeddings to learn a parser for Chinese, which works poorly due to the language mismatch. I recommend that you use the Chinese version of BERT instead. Here is the command needed for training the model: https://github.com/nikitakit/self-attentive-parser/blob/master/EXPERIMENTS.md#chinese-models

It also looks like our dataset sizes don't match. See here for CTB data preparation scripts. My data sizes are 17544 train / 352 dev / 348 test.
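One quick way to check whether your prepared splits match these counts, assuming the one-bracketed-tree-per-line format the CTB preparation scripts produce (`count_trees` is a hypothetical helper, not part of the repo):

```python
def count_trees(path):
    """Count non-empty lines in a prepared treebank file.

    Assumes one bracketed tree per line; verify this holds for your
    own prepared CTB files before trusting the count.
    """
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())
```

If the counts differ from 17544 / 352 / 348, the data split (not just the hyperparameters) is likely the source of the score gap.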


xiaoxiaoAurora commented on June 3, 2024

> This is a known limitation with how the parser is currently formulated: each tree must contain a non-terminal node at the root. In your example above, the "part-of-speech" tag PU is the root node, which breaks that assumption.
>
> The way I usually work around this is by wrapping all trees with a "TOP" symbol for the root node, and then disabling the "strip_top" flag when the treebank is known to contain examples that don't have a root non-terminal label.

Great! Thank you very much! That is very helpful.
However, when I tried to train the model on CTB data with the following command:

python src/main.py train --use-elmo --model-path-base models/ctb_elmo --num-layers 4 --epochs 500

training terminated with this result:

epoch 134 batch 112/112 processed 2,985,118 batch-loss 6.4597 grad-norm 17.0195 epoch-elapsed 0h05m52s total-elapsed 13h54m52s  
dev-fscore (Recall=84.26, Precision=85.77, FScore=85.01, CompleteMatch=19.32) dev-elapsed 0h00m15s total-elapsed 13h55m07s  
Terminating due to lack of improvement in dev fscore.

This score is well below the 91.69 F1 you report for benepar_zh on the CTB 5.1 test set.
My CTB data sizes:

  • train data: 22277
  • dev data: 1761

Could you provide the parameter settings in detail?

