dsindex / syntaxnet Goto Github PK

reference code for syntaxnet

Shell 40.04% Python 50.87% C++ 3.73% JavaScript 1.70% CSS 0.51% HTML 1.48% Starlark 1.67%

syntaxnet dependency-parser tests sejong-corpus dragnn training-parser training-tagger conll brat universal-dependencies

syntaxnet's Introduction

Table of Contents generated with DocToc

syntaxnet

syntaxnet

description

test code for syntaxnet
- training and test a model using UD corpus.
- training and test a Korean parser model using the Sejong corpus.
- exporting a trained model and serving(limited to the designated version of syntaxnet(old one))
- training and test a model using dragnn.
- comparision to bist-parser.

history

2017. 3. 27

test for dragnn
version

python : 2.7
bazel  : 0.4.3
protobuf : 3.2.0
syntaxnet : 40a5739ae26baf6bfa352d2dec85f5ca190254f8

2017. 3. 10

modify for recent version of syntaxnet(tf 1.0), OS X(bash script), universal treebank v2.0
version

python : 2.7
bazel  : 0.4.3
protobuf : 3.0.0b2, 3.2.0
syntaxnet : bc70271a51fe2e051b5d06edc6b9fd94880761d5

2016. 8. 16
- add 'char-map' to context.pbtxt' for train
- add '--resource_dir' for test
  - if you installed old version of syntaxnet(ex, a4b7bb9a5dd2c021edcd3d68d326255c734d0ef0 ), you should specify path to each files in 'context.pbtxt'
- version
```
syntaxnet : a5d45f2ed20effaabc213a2eb9def291354af1ec
```

how to test

# after installing syntaxnet.
# gpu supporting : https://github.com/tensorflow/models/issues/248#issuecomment-288991859
$ pwd
/path/to/models/syntaxnet
$ git clone https://github.com/dsindex/syntaxnet.git work
$ cd work
$ echo "hello syntaxnet" | ./demo.sh
# training parser only with parsed corpus
$ ./parser_trainer_test.sh

univeral dependency corpus

UD official website
- tutorial
- CoNLL-U format
UPPipe
- udpipe(git)
prepare data

$ cd work
$ mkdir corpus
$ cd corpus
# downloading ud-treebanks-v2.0.tgz
$ tar -zxvf ud-treebanks-v2.0.tgz  
$ ls universal-dependencies-2.0 
$ UD_Ancient_Greek  UD_Basque  UD_Czech ....

training tagger and parser with another corpus

# for example, training UD_English.
# detail instructions can be found in https://github.com/tensorflow/models/tree/master/syntaxnet
$ ./train.sh -v -v
...
#preprocessing with tagger
INFO:tensorflow:Seconds elapsed in evaluation: 9.77, eval metric: 99.71%
INFO:tensorflow:Seconds elapsed in evaluation: 1.26, eval metric: 92.04%
INFO:tensorflow:Seconds elapsed in evaluation: 1.26, eval metric: 92.07%
...
#pretrain parser
INFO:tensorflow:Seconds elapsed in evaluation: 4.97, eval metric: 82.20%
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 44.30, eval metric: 92.36%
INFO:tensorflow:Seconds elapsed in evaluation: 5.42, eval metric: 82.67%
INFO:tensorflow:Seconds elapsed in evaluation: 5.59, eval metric: 82.36%
...
#train parser
INFO:tensorflow:Seconds elapsed in evaluation: 57.69, eval metric: 83.95%
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 283.77, eval metric: 96.54%
INFO:tensorflow:Seconds elapsed in evaluation: 34.49, eval metric: 84.09%
INFO:tensorflow:Seconds elapsed in evaluation: 34.97, eval metric: 83.49%
...

training parser only

# if you have other pos-tagger and want to build parser only from the parsed corpus :
$ ./train_p.sh -v -v
...
#pretrain parser
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 44.15, eval metric: 92.21%
INFO:tensorflow:Seconds elapsed in evaluation: 5.56, eval metric: 87.84%
INFO:tensorflow:Seconds elapsed in evaluation: 5.43, eval metric: 86.56%
...
#train parser
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 279.04, eval metric: 94.60%
INFO:tensorflow:Seconds elapsed in evaluation: 33.19, eval metric: 88.60%
INFO:tensorflow:Seconds elapsed in evaluation: 32.57, eval metric: 87.77%
...

test new model

$ echo "this is my own tagger and parser" | ./test.sh
...
Input: this is my own tagger and parser
Parse:
tagger NN ROOT
 +-- this DT nsubj
 +-- is VBZ cop
 +-- my PRP$ nmod:poss
 +-- own JJ amod
 +-- and CC cc
 +-- parser NN conj

# original model
$ echo "this is my own tagger and parser" | ./demo.sh
Input: this is my own tagger and parser
Parse:
tagger NN ROOT
 +-- this DT nsubj
 +-- is VBZ cop
 +-- my PRP$ poss
 +-- own JJ amod
 +-- and CC cc
 +-- parser ADD conj 

$ echo "Bob brought the pizza to Alice ." | ./test.sh
Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- Alice NNP nmod
 |   +-- to IN case
 +-- . . punct

# original model
$ echo "Bob brought the pizza to Alice ." | ./demo.sh
Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct

training parser from Sejong treebank corpus

# the corpus is accessible through the path on this image : https://raw.githubusercontent.com/dsindex/blog/master/images/url_sejong.png
# copy sejong_treebank.txt.v1 to `sejong` directory.
$ ./sejong/split.sh
$ ./sejong/c2d.sh
$ ./train_sejong.sh
#pretrain parser
...
NFO:tensorflow:Seconds elapsed in evaluation: 14.18, eval metric: 93.43%
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 116.08, eval metric: 95.11%
INFO:tensorflow:Seconds elapsed in evaluation: 14.60, eval metric: 93.76%
INFO:tensorflow:Seconds elapsed in evaluation: 14.45, eval metric: 93.78%
...
#evaluate pretrained parser by eoj-based
accuracy(UAS) = 0.903289
accuracy(UAS) = 0.876198
accuracy(UAS) = 0.876888
...
#train parser
INFO:tensorflow:Seconds elapsed in evaluation: 137.36, eval metric: 94.12%
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 1806.21, eval metric: 96.37%
INFO:tensorflow:Seconds elapsed in evaluation: 224.40, eval metric: 94.19%
INFO:tensorflow:Seconds elapsed in evaluation: 223.75, eval metric: 94.25%
...

#evaluate parser by eoj-based
accuracy(UAS) = 0.928845
accuracy(UAS) = 0.886139
accuracy(UAS) = 0.887824
...

test korean parser model

$ cat sejong/tagged_input.sample
1	프랑스	프랑스	NNP	NNP	_	0	_	_	_
2	의	의	JKG	JKG	_	0	_	_	_
3	세계	세계	NNG	NNG	_	0	_	_	_
4	적	적	XSN	XSN	_	0	_	_	_
5	이	이	VCP	VCP	_	0	_	_	_
6	ᆫ	ᆫ	ETM	ETM	_	0	_	_	_
7	의상	의상	NNG	NNG	_	0	_	_	_
8	디자이너	디자이너	NNG	NNG	_	0	_	_	_
9	엠마누엘	엠마누엘	NNP	NNP	_	0	_	_	_
10	웅가로	웅가로	NNP	NNP	_	0	_	_	_
11	가	가	JKS	JKS	_	0	_	_	_
12	실내	실내	NNG	NNG	_	0	_	_	_
13	장식	장식	NNG	NNG	_	0	_	_	_
14	용	용	XSN	XSN	_	0	_	_	_
15	직물	직물	NNG	NNG	_	0	_	_	_
16	디자이너	디자이너	NNG	NNG	_	0	_	_	_
17	로	로	JKB	JKB	_	0	_	_	_
18	나서	나서	VV	VV	_	0	_	_	_
19	었	었	EP	EP	_	0	_	_	_
20	다	다	EF	EF	_	0	_	_	_
21	.	.	SF	SF	_	0	_	_	_

$ cat sejong/tagged_input.sample | ./test_sejong.sh -v -v
Input: 프랑스 의 세계 적 이 ᆫ 의상 디자이너 엠마누엘 웅가로 가 실내 장식 용 직물 디자이너 로 나서 었 다 .
Parse:
. SF ROOT
 +-- 다 EF MOD
     +-- 었 EP MOD
         +-- 나서 VV MOD
             +-- 가 JKS NP_SBJ
             |   +-- 웅가로 NNP MOD
             |       +-- 디자이너 NNG NP
             |       |   +-- 의 JKG NP_MOD
             |       |   |   +-- 프랑스 NNP MOD
             |       |   +-- ᆫ ETM VNP_MOD
             |       |   |   +-- 이 VCP MOD
             |       |   |       +-- 적 XSN MOD
             |       |   |           +-- 세계 NNG MOD
             |       |   +-- 의상 NNG NP
             |       +-- 엠마누엘 NNP NP
             +-- 로 JKB NP_AJT
                 +-- 디자이너 NNG MOD
                     +-- 직물 NNG NP
                         +-- 실내 NNG NP
                         +-- 용 XSN NP
                             +-- 장식 NNG MOD

apply korean POS tagger(Komoran via konlpy)

# after installing konlpy ( http://konlpy.org/ko/v0.4.3/ )
$ python sejong/tagger.py
나는 학교에 간다.
1	나	나	NP	NP	_	0	_	_	_
2	는	는	JX	JX	_	0	_	_	_
3	학교	학교	NNG	NNG	_	0	_	_	_
4	에	에	JKB	JKB	_	0	_	_	_
5	가	가	VV	VV	_	0	_	_	_
6	ㄴ다	ㄴ다	EF	EF	_	0	_	_	_
7	.	.	SF	SF	_	0	_	_	_

$ echo "나는 학교에 간다." | python sejong/tagger.py | ./test_sejong.sh
Input: 나 는 학교 에 가 ㄴ다 .
Parse:
. SF ROOT
 +-- ㄴ다 EF MOD
     +-- 가 VV MOD
         +-- 는 JX NP_SBJ
         |   +-- 나 NP MOD
         +-- 에 JKB NP_AJT
             +-- 학교 NNG MOD

tensorflow serving and syntaxnet

$ bazel-bin/tensorflow_serving/example/parsey_client --server=localhost:9000
나는 학교에 간다
Input :  나는 학교에 간다
Parsing :
{"result": [{"text": "나 는 학교 에 가 ㄴ다", "token": [{"category": "NP", "head": 1, "end": 2, "label": "MOD", "start": 0, "tag": "NP", "word": "나"}, {"category": "JX", "head": 4, "end": 6, "label": "NP_SBJ", "start": 4, "tag": "JX", "word": "는"}, {"category": "NNG", "head": 3, "end": 13, "label": "MOD", "start": 8, "tag": "NNG", "word": "학교"}, {"category": "JKB", "head": 4, "end": 17, "label": "NP_AJT", "start": 15, "tag": "JKB", "word": "에"}, {"category": "VV", "head": 5, "end": 21, "label": "MOD", "start": 19, "tag": "VV", "word": "가"}, {"category": "EC", "end": 28, "label": "ROOT", "start": 23, "tag": "EC", "word": "ㄴ다"}], "docid": "-:0"}]}
...

parsey's cousins

a collection of pretrained syntactic models
how to test

# download models from http://download.tensorflow.org/models/parsey_universal/<language>.zip

# for `English`
$ echo "Bob brought the pizza to Alice." | ./parse.sh

# tokenizing
Bob brought the pizza to Alice .

# morphological analysis
1	Bob	_	_	_	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
2	brought	_	_	_	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	_	_	_
3	the	_	_	_	Definite=Def|PronType=Art|fPOS=DET++DT	0	_	_	_
4	pizza	_	_	_	Number=Sing|fPOS=NOUN++NN	0	_	_	_
5	to	_	_	_	fPOS=ADP++IN	0	_	_	_
6	Alice	_	_	_	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
7	.	_	_	_	fPOS=PUNCT++.	0	_	_	_

# tagging
1	Bob	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
2	brought	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	_	_	_
3	the	_	DET	DT	Definite=Def|PronType=Art|fPOS=DET++DT	0	_	_	_
4	pizza	_	NOUN	NN	Number=Sing|fPOS=NOUN++NN	0	_	_	_
5	to	_	ADP	IN	fPOS=ADP++IN	0	_	_	_
6	Alice	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
7	.	_	PUNCT	.	fPOS=PUNCT++.	0	_	_	_

# parsing
1	Bob	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	2	nsubj	_	_
2	brought	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	ROOT	_	_
3	the	_	DET	DT	Definite=Def|PronType=Art|fPOS=DET++DT	4	det	_	_
4	pizza	_	NOUN	NN	Number=Sing|fPOS=NOUN++NN	2	dobj	_	_
5	to	_	ADP	IN	fPOS=ADP++IN	6	case	_	_
6	Alice	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	2	nmod	_	_
7	.	_	PUNCT	.	fPOS=PUNCT++.	2	punct	_	_

# conll2tree 
Input: Bob brought the pizza to Alice .
Parse:
brought VERB++VBD ROOT
 +-- Bob PROPN++NNP nsubj
 +-- pizza NOUN++NN dobj
 |   +-- the DET++DT det
 +-- Alice PROPN++NNP nmod
 |   +-- to ADP++IN case
 +-- . PUNCT++. punct

downloaded model vs trained model

1. downloaded model
Language	No. tokens	POS	fPOS	Morph	UAS	LAS
-------------------------------------------------------
English	25096	90.48%	89.71%	91.30%	84.79%	80.38%

2. trained model
INFO:tensorflow:Total processed documents: 2077
INFO:tensorflow:num correct tokens: 18634
INFO:tensorflow:total tokens: 22395
INFO:tensorflow:Seconds elapsed in evaluation: 19.85, eval metric: 83.21%

3. where does the difference(84.79% - 83.21%) come from?
as mentioned https://research.googleblog.com/2016/08/meet-parseys-cousins-syntax-for-40.html
they found good hyperparameters by using MapReduce.
for example, 
the hyperparameters for POS tagger :
  - POS_PARAMS=128-0.08-3600-0.9-0
  - decay_steps=3600
  - hidden_layer_sizes=128
  - learning_rate=0.08
  - momentum=0.9

dragnn

how to compile examples

$ cd ../
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //examples/dragnn:tutorial_1

training tagger and parser with CoNLL corpus

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:write_master_spec
$ bazel build -c opt //work/dragnn_examples:train_dragnn
$ bazel build -c opt //work/dragnn_examples:inference_dragnn
# training
$ cd work
$ ./train_dragnn.sh -v -v
...
INFO:tensorflow:training step: 25300, actual: 25300
INFO:tensorflow:training step: 25400, actual: 25400
INFO:tensorflow:finished step: 25400, actual: 25400
INFO:tensorflow:Annotating datset: 2002 examples
INFO:tensorflow:Done. Produced 2002 annotations
INFO:tensorflow:Total num documents: 2002
INFO:tensorflow:Total num tokens: 25148
INFO:tensorflow:POS: 85.63%
INFO:tensorflow:UAS: 79.67%
INFO:tensorflow:LAS: 74.36%
...
# test
$ echo "i love this one" | ./test_dragnn.sh
Input: i love this one
Parse:
love VBP root
 +-- i PRP nsubj
 +-- one CD obj
     +-- this DT det

training parser with Sejong corpus

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:write_master_spec
$ bazel build -c opt //work/dragnn_examples:train_dragnn
$ bazel build -c opt //work/dragnn_examples:inference_dragnn_sejong
# training
$ cd work
# to prepare corpus, please refer to `training parser from Sejong treebank corpus` section.
$ ./train_dragnn_sejong.sh -v -v
...
INFO:tensorflow:training step: 33100, actual: 33100
INFO:tensorflow:training step: 33200, actual: 33200
INFO:tensorflow:finished step: 33200, actual: 33200
INFO:tensorflow:Annotating datset: 4114 examples
INFO:tensorflow:Done. Produced 4114 annotations
INFO:tensorflow:Total num documents: 4114
INFO:tensorflow:Total num tokens: 97002
INFO:tensorflow:POS: 93.95%
INFO:tensorflow:UAS: 91.38%
INFO:tensorflow:LAS: 87.76%
...
# test
# after installing konlpy ( http://konlpy.org/ko/v0.4.3/ )
$ echo "제주로 가는 비행기가 심한 비바람에 회항했다." | ./test_dragnn_sejong.sh
INFO:tensorflow:Read 1 documents
Input: 제주 로 가 는 비행기 가 심하 ㄴ 비바람 에 회항 하 았 다 .
Parse:
. SF VP
 +-- 다 EF MOD
     +-- 았 EP MOD
         +-- 하 XSA MOD
             +-- 회항 SN MOD
                 +-- 가 JKS NP_SBJ
                 |   +-- 비행기 NNG MOD
                 |       +-- 는 ETM VP_MOD
                 |           +-- 가 VV MOD
                 |               +-- 로 JKB NP_AJT
                 |                   +-- 제주 MAG MOD
                 +-- 에 JKB NP_AJT
                     +-- 비바람 NNG MOD
                         +-- ㄴ SN MOD
                             +-- 심하 VV NP
# it seems that pos tagging results from the dragnn are somewhat incorrect.
# so, i replace those to the results from the Komoran tagger.
# you can modify 'inference_dragnn_sejong.py' to use the tags from the dragnn.
Input: 제주 로 가 는 비행기 가 심하 ㄴ 비바람 에 회항 하 았 다 .
Parse:
. SF VP
 +-- 다 EF MOD
     +-- 았 EP MOD
         +-- 하 XSV MOD
             +-- 회항 NNG MOD
                 +-- 가 JKS NP_SBJ
                 |   +-- 비행기 NNG MOD
                 |       +-- 는 ETM VP_MOD
                 |           +-- 가 VV MOD
                 |               +-- 로 JKB NP_AJT
                 |                   +-- 제주 NNG MOD
                 +-- 에 JKB NP_AJT
                     +-- 비바람 NNG MOD
                         +-- ㄴ ETM MOD
                             +-- 심하 VA NP

web api using tornado

how to run

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:dragnn_dm
# start tornado web api
$ cd work/dragnn_examples/www
# start single process
$ ./start.sh -v -v 0 0
# despite tornado suppoting multi-processing, session of tensorflow is not fork-safe.
# so do not use multi-processing option.
# if you want to link to the model trained by Sejong corpus, just edit env.sh
# : enable_konlpy='True'

# http://hostip:8897 
# http://hostip:8897/dragnn?q=i love it
# http://hostip:8897/dragnn?q=나는 학교에 가서 공부했다.

api output format(sample)

brat annotation tool

comparison to BIST parser

syntaxnet's People

Contributors

Stargazers

Watchers

Forkers

guduxingzou angrydata rainmaker712 yanzqing j-min bobguns chzlithoo johndpope ilyeong-ai jbb100 chenmoshushi lineryang khan007 yeppha namjae nrobin alexfridlyand louiekang stevenlol chagge jhowliu leezqcst wmx3ng bahrie127 mskylsjwg wangzhaolang roboreport shephexd hmilysls songjein summatic wudeshi nlc-jychoi ponteineptique andyrbm zhoujian1210 hailiang-wang dgreen2017 wellslee ledavinci binzhouchn yunjeonglee hbcbh1999 pushkarajc apurvnagvenkar minju54 shushengyang huangxiancun kyuhwas sayduke lazuraslong hpylieva youikim greitzmann hgfdsa102 ostazia

syntaxnet's Issues

How to generate .pb file for android

my android project need analysis user's intent by strings, so i want to use syntaxnet and tensorflow-android sample ,and tensorflow-android need .pb file,but i don't know how to generate it by syntaxnet , if someone done the same thing,tell me the method ,thanks

DRAGNN - Tensorflow Serving

Hello again!:)

Could you please share your knowledge about Dragnn and tf serving integration?
Do you know current state?
I see that you did wrapper for REST with tornado for dragnn_script, that basically does conllu/proto to json.

I just want to hear your thoughts on all this.
I've seen your integration with README_API, but from what i understand it's for syntaxnet.

I'm really trying to investigate what it will take to make tf1.0+latest serving + dragnn.

Thank you in advance!

Parser a custom corpus

Hi, first of all thanks very much for your scripts. I was dealing 2 weeks with Syntaxnet to make it work and with your files it was very easy. So, THANKS!

My question it's about parser a custom corpus. I want to parser SFU, PangLee2004 and PangLee2005. Now, I have commented:

#pretrain_parser
#evaluate_pretrained_parser
#train_parser
evaluate_parser
#copy_model
close_fd

To only execute the evaluation phase and avoid training again. I have tried to change in (corpus_folder)/context.pbtxt_p on:

input {
name: 'tagged-test-corpus'
record_format: 'conll-sentence'
Part {
file_pattern: '/home/iago/Escritorio/Probar_Parser_Google/UT_English/en-ut-test.conllu.conv'
}
}

to:

input {
name: 'tagged-test-corpus'
record_format: 'conll-sentence'
Part {
file_pattern: '/home/iago/Escritorio/Probar_Parser_Google/UT_English/en-ut-SFU.conll' (for example)
}
}

And then execute ~$./train_p.sh -v -v but I don't know if this is the correct way. I'm doing it right?

Thanks and regards.

Serve a different language model?

I followed the instructions from here:
https://github.com/dsindex/syntaxnet/blob/master/README_api.md

This allowed me to run the server and use it using a client based on the parsey_api.proto and sentence.proto files (I implemented my own Java client for this).

What I do not understand is how to best switch the model to one of "Parsey's cousins" ? Eg if I download the German model http://download.tensorflow.org/models/parsey_universal/German.zip, how can I use that model instead of the English one? Can I use the same client with that model?

Question about "installing syntaxnet"

I am a beginner of SyntaxNet.

(1) Basically when you mention 'installing syntaxnet", you mean build syntaxnet from source and then 'pip install' ? In other word, "Manual installation" in the https://github.com/tensorflow/models/tree/master/research/syntaxnet? Please confirm. thanks

(2) Then, my question:

What is 'Ubuntu 16.10+ binary installation" section for? I followed the section to install syntaxnet. How do I modify your scripts in this git if I want to use binary installation?

I found binary installation installed "parser_ops.so" (part of python syntaxnet package). Do you know where could I find the C++ API guide if I want to call APIs of parser_ops.so from C++ source code?

thanks.

How to run conll17 dragnn baseline model?

Hello!
Amazing work!
Could you please tell me how with your scripts to pass text file to the test dragnn script (Russian baseline model) and output in CoNLL format?

If possible please provide detailed instruction. Where to copy files and so on ....
Thanks in advance!

Where is the context.pbtxt in UD_language?

Hi @dsindex ,
I altered CORPUS_DIR=${CDIR}/UD_English to CORPUS_DIR=${CDIR}/corpus/ud-treebanks-v2.0/UD_Chinese. I thought this can train the Chinese corpus.
When I run $ ./train.sh -v -v, It says there is can not find context.pbtxt in UD_Chinese directory.
I am kind of confused. You have a directory named English, which contains the filed obtained from training. Also, you have context.pbtxt in UD_English. How can I run the train.sh without the context.pbtxt in UD_Chinese?
Am I misunderstanding something in here?

Training text segmentation and morphological analysis

It would be great to have information on how to train the text segmentation or morphological analysis parts of SyntaxNet (as far as I know in Parsey's Cousins they are referred to as tokenizer and morpher). And additionally, it would be nice to know what they are actually used for.

Errors when running the server script

I found errors when building "parsey_api" for the server here:
bazel --output_user_root=bazel_root build --nocheck_visibility -c opt -s //tensorflow_serving/example:parsey_api --genrule_strategy=standalone --spawn_strategy=standalone --verbose_failures

Extracting Bazel installation...
........
ERROR: Failed to load Skylark extension '//tensorflow/tensorflow:workspace.bzl'.
It usually happens when the repository is not defined prior to being used.
Maybe repository '' was defined later in your WORKSPACE file?
ERROR: cycles detected during target parsing.
INFO: Elapsed time: 2.306s
./batch.sh: line 52: ./bazel-bin/tensorflow_serving/example/parsey_api: No such file or directory

I am using:
Ubuntu 14.04
Bazel 0.4.4
python gPRC installed

Should I install the C++ version of gRPC?

Thank you.

cannot import name graph_builder

Hi @dsindex
I read your train.sh. It calls /private/var/tmp/_bazel_aluminumbox/dbea27ee7390ed619a92467ce8d5c86b/execroot/main/bazel-out/local-opt/bin/syntaxnet/parser_trainer.
and then parser_trainer calls
/private/var/tmp/_bazel_aluminumbox/dbea27ee7390ed619a92467ce8d5c86b/execroot/main/bazel-out/local-opt/bin/syntaxnet/parser_trainer.py.

I can run train.sh and parser_trainer. However, when I tried to debug parser_trainer.py and input the arguments in the configuration. Below error occurs:

ImportError: cannot import name graph_builder

I found that there is graph_builder and structured_graph_builder in the syntaxnet repository, but not in the python envirionment (site-packages/syntaxnet/).

Did I make mistakes during the installation? Do you have these two python scripts in your python environment?

Running SyntaxNet with designated instance (in Python-level)

I posted this question at stackoverflow but didn't get a good answer yet.

Could you please let me know how I designate which instance to use when training/testing SyntaxNet?

In other tensorflow models we can easily change configurations by editing Python code:

ex) tf.device('/cpu:0') => tf.device('/gpu:0').

I could run parsey mcparseface model via running demo.sh and I followed back symbolic links to find device configurations.

Maybe I missed something. But I cannot find gpu configuration python codes in demo.sh, parser_eval.py and context.proto.

When I search with query 'device' in tensorflow/models, I could see several C files such as syntaxnet/syntaxnet/unpack_sparse_features.cc contain line using tensorflow::DEVICE_CPU;

So.. is to change C codes in these files the only way to change device configuration for SyntaxNet?

I hope there is a simpler way to change the setting in Python level.

Thanks in advance.

Training with Sejong Treebank corpus

Hello,

I was trying to train using the sejong_treebank.sample file, so I ran the following commands:
$ ./sejong/split.sh
$ ./sejong/c2d.sh
$ ./train_sejong.sh

But had an error (same as one below -- "Assign requires shapes of both tensors to match").
So then, I tried downloading a larger treebank corpus from sejong.or.kr (it seems to be the full version of the sejong_treebank.sample in your repository, but then again I'm not sure...) But, the same thing happened.

My input file (tried both sample and full) is just an long stream of the following in UTF-8, just like your sample Sejong file. Is there somewhere else I need to put this? Or is there something else I need other than saving this as sejong/sejong_treebank.txt.v1 and running the scripts?

; 1993/06/08 19 
(NP     (NP 1993/SN + //SP + 06/SN + //SP + 08/SN)
        (NP 19/SN))
; 엠마누엘 웅가로 / 
(NP     (NP     (NP 엠마누엘/NNP)
                (NP 웅가로/NNP))
        (X //SP))
; 의상서 실내 장식품으로… 
(NP_AJT         (NP_AJT 의상/NNG + 서/JKB)
        (NP_AJT         (NP 실내/NNG)
                (NP_AJT 장식품/NNG + 으로/JKB + …/SE)))
; 디자인 세계 넓혀 
(VP     (NP_OBJ         (NP 디자인/NNG)
                (NP_OBJ 세계/NNG))
        (VP 넓히/VV + 어/EC))
; 프랑스의 세계적인 의상 디자이너 엠마누엘 웅가로가 실내 장식용 직물 디자이너로 나섰다. 
(S      (NP_SBJ         (NP     (NP_MOD 프랑스/NNP + 의/JKG)
                        (NP     (VNP_MOD 세계/NNG + 적/XSN + 이/VCP + ᆫ/ETM)
                                (NP     (NP 의상/NNG)
                                        (NP 디자이너/NNG))))
                (NP_SBJ         (NP 엠마누엘/NNP)
                        (NP_SBJ 웅가로/NNP + 가/JKS)))
        (VP     (NP_AJT         (NP     (NP     (NP 실내/NNG)
                                        (NP 장식/NNG + 용/XSN))
                                (NP 직물/NNG))
                        (NP_AJT 디자이너/NNG + 로/JKB))
                (VP 나서/VV + 었/EP + 다/EF + ./SF)))

Here's the logs with all the verbose options.

andy@andy ~/Downloads/syntaxnet/models/syntaxnet/work $ ./sejong/split.sh  -v -v
+ '[' 0 '!=' 0 ']'
++++ readlink -f ./sejong/split.sh
+++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/split.sh
++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+ CDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+ [[ -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh ]]
+ . /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++ set -o errexit
++ export LC_ALL=ko_KR.UTF-8
++ LC_ALL=ko_KR.UTF-8
++ export LANG=ko_KR.UTF-8
++ LANG=ko_KR.UTF-8
+++++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
+++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
++ CDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+++++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
+++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/..
++ PDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work
++ python=/usr/bin/python
+ make_calmness
+ exec
+ exec
+ child_verbose='-v -v'
+ '[' '!' -e /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/wdir ']'
+ WDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/wdir
+ '[' '!' -e /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/log ']'
+ LDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/log
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/split.py --mode=0
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/split.py --mode=1
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/split.py --mode=2
+ close_fd
+ exec


andy@andy ~/Downloads/syntaxnet/models/syntaxnet/work $ ./sejong/c2d.sh  -v -v
+ '[' 0 '!=' 0 ']'
++++ readlink -f ./sejong/c2d.sh
+++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.sh
++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+ CDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+ [[ -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh ]]
+ . /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++ set -o errexit
++ export LC_ALL=ko_KR.UTF-8
++ LC_ALL=ko_KR.UTF-8
++ export LANG=ko_KR.UTF-8
++ LANG=ko_KR.UTF-8
+++++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
+++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
++ CDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong
+++++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
++++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/env.sh
+++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/..
++ PDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work
++ python=/usr/bin/python
+ make_calmness
+ exec
+ exec
+ child_verbose='-v -v'
+ '[' '!' -e /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/wdir ']'
+ WDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/wdir
+ '[' '!' -e /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/log ']'
+ LDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/log
+ for SET in training tuning test
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=0
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=1
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/align.py
number_of_sent = 0, number_of_sent_skip = 0
+ for SET in training tuning test
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=0
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=1
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/align.py
number_of_sent = 0, number_of_sent_skip = 0
+ for SET in training tuning test
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=0
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/c2d.py --mode=1
+ /usr/bin/python /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/align.py
number_of_sent = 0, number_of_sent_skip = 0
+ close_fd
+ exec


andy@andy ~/Downloads/syntaxnet/models/syntaxnet/work $ ./train_sejong.sh  -v -v
+ '[' 0 '!=' 0 ']'
++++ readlink -f ./train_sejong.sh
+++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/train_sejong.sh
++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work
+ CDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work
++++ readlink -f ./train_sejong.sh
+++ dirname /home/andy/Downloads/syntaxnet/models/syntaxnet/work/train_sejong.sh
++ readlink -f /home/andy/Downloads/syntaxnet/models/syntaxnet/work/..
+ PDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet
+ make_calmness
+ exec
+ exec
+ cd /home/andy/Downloads/syntaxnet/models/syntaxnet
+ python=/usr/bin/python
+ SYNTAXNET_HOME=/home/andy/Downloads/syntaxnet/models/syntaxnet
+ BINDIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet
+ CONTEXT=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/context.pbtxt_p
+ TMP_DIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output
+ mkdir -p /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output
+ cat /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/context.pbtxt_p
+ sed s=OUTPATH=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output=
+ MODEL_DIR=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/models
+ HIDDEN_LAYER_SIZES=512,512
+ HIDDEN_LAYER_PARAMS=512,512
+ BATCH_SIZE=256
+ BEAM_SIZE=16
+ LP_PARAMS=512,512-0.08-4400-0.85
+ GP_PARAMS=512,512-0.02-100-0.9
+ pretrain_parser
+ /home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer --arg_prefix=brain_parser --batch_size=256 --compute_lexicon --decay_steps=4400 --graph_builder=greedy --hidden_layer_sizes=512,512 --learning_rate=0.08 --momentum=0.85 --beam_size=1 --output_path=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output --task_context=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/context --projectivize_training_set --training_corpus=tagged-training-corpus --tuning_corpus=tagged-tuning-corpus --params=512,512-0.08-4400-0.85 --num_epochs=20 --report_every=100 --checkpoint_every=1000 --logtostderr
INFO:tensorflow:Computing lexicon...
I syntaxnet/lexicon_builder.cc:124] Term maps collected over 0 tokens from 0 documents
I syntaxnet/term_frequency_map.cc:137] Saved 0 terms to /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/word-map.
I syntaxnet/term_frequency_map.cc:137] Saved 0 terms to /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/lcword-map.
I syntaxnet/term_frequency_map.cc:137] Saved 0 terms to /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/tag-map.
I syntaxnet/term_frequency_map.cc:137] Saved 0 terms to /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/category-map.
I syntaxnet/term_frequency_map.cc:137] Saved 0 terms to /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/label-map.
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/label-map.
I syntaxnet/embedding_feature_extractor.cc:35] Features: input.word input(1).word input(2).word input(3).word stack.word stack(1).word stack(2).word stack(3).word stack.child(1).word stack.child(1).sibling(-1).word stack.child(-1).word stack.child(-1).sibling(1).word stack(1).child(1).word stack(1).child(1).sibling(-1).word stack(1).child(-1).word stack(1).child(-1).sibling(1).word stack.child(2).word stack.child(-2).word stack(1).child(2).word stack(1).child(-2).word; input.tag input(1).tag input(2).tag input(3).tag stack.tag stack(1).tag stack(2).tag stack(3).tag stack.child(1).tag stack.child(1).sibling(-1).tag stack.child(-1).tag stack.child(-1).sibling(1).tag stack(1).child(1).tag stack(1).child(1).sibling(-1).tag stack(1).child(-1).tag stack(1).child(-1).sibling(1).tag stack.child(2).tag stack.child(-2).tag stack(1).child(2).tag stack(1).child(-2).tag; stack.child(1).label stack.child(1).sibling(-1).label stack.child(-1).label stack.child(-1).sibling(1).label stack(1).child(1).label stack(1).child(1).sibling(-1).label stack(1).child(-1).label stack(1).child(-1).sibling(1).label stack.child(2).label stack.child(-2).label stack(1).child(2).label stack(1).child(-2).label 
I syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;tags;labels
I syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;32;32
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/word-map.
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/tag-map.
INFO:tensorflow:Preprocessing...
INFO:tensorflow:Training...
INFO:tensorflow:Building training network with parameters: feature_sizes: [20 20 12] domain_sizes: [3 3 3]
INFO:tensorflow:Initializing...
INFO:tensorflow:Training...
I syntaxnet/embedding_feature_extractor.cc:35] Features: input.word input(1).word input(2).word input(3).word stack.word stack(1).word stack(2).word stack(3).word stack.child(1).word stack.child(1).sibling(-1).word stack.child(-1).word stack.child(-1).sibling(1).word stack(1).child(1).word stack(1).child(1).sibling(-1).word stack(1).child(-1).word stack(1).child(-1).sibling(1).word stack.child(2).word stack.child(-2).word stack(1).child(2).word stack(1).child(-2).word; input.tag input(1).tag input(2).tag input(3).tag stack.tag stack(1).tag stack(2).tag stack(3).tag stack.child(1).tag stack.child(1).sibling(-1).tag stack.child(-1).tag stack.child(-1).sibling(1).tag stack(1).child(1).tag stack(1).child(1).sibling(-1).tag stack(1).child(-1).tag stack(1).child(-1).sibling(1).tag stack.child(2).tag stack.child(-2).tag stack(1).child(2).tag stack(1).child(-2).tag; stack.child(1).label stack.child(1).sibling(-1).label stack.child(-1).label stack.child(-1).sibling(1).label stack(1).child(1).label stack(1).child(1).sibling(-1).label stack(1).child(-1).label stack(1).child(-1).sibling(1).label stack.child(2).label stack.child(-2).label stack(1).child(2).label stack(1).child(-2).label 
I syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;tags;labels
I syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;32;32
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/word-map.
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/tag-map.
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/label-map.
I syntaxnet/reader_ops.cc:141] Starting epoch 1
I syntaxnet/reader_ops.cc:141] Starting epoch 2
I syntaxnet/reader_ops.cc:141] Starting epoch 3
I syntaxnet/reader_ops.cc:141] Starting epoch 4
I syntaxnet/reader_ops.cc:141] Starting epoch 5
I syntaxnet/reader_ops.cc:141] Starting epoch 6
I syntaxnet/reader_ops.cc:141] Starting epoch 7
I syntaxnet/reader_ops.cc:141] Starting epoch 8
I syntaxnet/reader_ops.cc:141] Starting epoch 9
I syntaxnet/reader_ops.cc:141] Starting epoch 10
I syntaxnet/reader_ops.cc:141] Starting epoch 11
I syntaxnet/reader_ops.cc:141] Starting epoch 12
I syntaxnet/reader_ops.cc:141] Starting epoch 13
I syntaxnet/reader_ops.cc:141] Starting epoch 14
I syntaxnet/reader_ops.cc:141] Starting epoch 15
I syntaxnet/reader_ops.cc:141] Starting epoch 16
I syntaxnet/reader_ops.cc:141] Starting epoch 17
I syntaxnet/reader_ops.cc:141] Starting epoch 18
I syntaxnet/reader_ops.cc:141] Starting epoch 19
I syntaxnet/reader_ops.cc:141] Starting epoch 20
+ evaluate_pretrained_parser
+ for SET in training tuning test
+ /home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval --task_context=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/brain_parser/greedy/512,512-0.08-4400-0.85/context --batch_size=256 --hidden_layer_sizes=512,512 --beam_size=1 --input=tagged-training-corpus --output=parsed-training-corpus --arg_prefix=brain_parser --graph_builder=greedy --model_path=/home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/brain_parser/greedy/512,512-0.08-4400-0.85/model
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/label-map.
I syntaxnet/embedding_feature_extractor.cc:35] Features: input.word input(1).word input(2).word input(3).word stack.word stack(1).word stack(2).word stack(3).word stack.child(1).word stack.child(1).sibling(-1).word stack.child(-1).word stack.child(-1).sibling(1).word stack(1).child(1).word stack(1).child(1).sibling(-1).word stack(1).child(-1).word stack(1).child(-1).sibling(1).word stack.child(2).word stack.child(-2).word stack(1).child(2).word stack(1).child(-2).word; input.tag input(1).tag input(2).tag input(3).tag stack.tag stack(1).tag stack(2).tag stack(3).tag stack.child(1).tag stack.child(1).sibling(-1).tag stack.child(-1).tag stack.child(-1).sibling(1).tag stack(1).child(1).tag stack(1).child(1).sibling(-1).tag stack(1).child(-1).tag stack(1).child(-1).sibling(1).tag stack.child(2).tag stack.child(-2).tag stack(1).child(2).tag stack(1).child(-2).tag; stack.child(1).label stack.child(1).sibling(-1).label stack.child(-1).label stack.child(-1).sibling(1).label stack(1).child(1).label stack(1).child(1).sibling(-1).label stack(1).child(-1).label stack(1).child(-1).sibling(1).label stack.child(2).label stack.child(-2).label stack(1).child(2).label stack(1).child(-2).label 
I syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;tags;labels
I syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;32;32
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/word-map.
I syntaxnet/term_frequency_map.cc:101] Loaded 0 terms from /home/andy/Downloads/syntaxnet/models/syntaxnet/work/sejong/tmp_p/syntaxnet-output/tag-map.
INFO:tensorflow:Building training network with parameters: feature_sizes: [20 20 12] domain_sizes: [3 3 3]
Traceback (most recent call last):
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 149, in <module>
    tf.app.run()
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 145, in main
    Eval(sess, num_actions, feature_sizes, domain_sizes, embedding_dims)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 98, in Eval
    parser.saver.restore(sess, FLAGS.model_path)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/training/saver.py", line 1104, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/client/session.py", line 333, in run
    run_metadata_ptr)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/client/session.py", line 573, in _run
    feed_dict_string, options, run_metadata)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/client/session.py", line 653, in _do_run
    target_list, options, run_metadata)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/client/session.py", line 673, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3,64] rhs shape= [485,64]
     [[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@embedding_matrix_0"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](params/embedding_matrix_0/ExponentialMovingAverage, save/restore_slice_5)]]
Caused by op u'save/Assign_5', defined at:
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 149, in <module>
    tf.app.run()
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 145, in main
    Eval(sess, num_actions, feature_sizes, domain_sizes, embedding_dims)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/parser_eval.py", line 96, in Eval
    parser.AddSaver(FLAGS.slim_model)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/__main__/syntaxnet/graph_builder.py", line 568, in AddSaver
    self.saver = tf.train.Saver(variables_to_save)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/training/saver.py", line 845, in __init__
    restore_sequentially=restore_sequentially)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/training/saver.py", line 515, in build
    filename_tensor, vars_to_save, restore_sequentially, reshape)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/training/saver.py", line 281, in _AddRestoreOps
    validate_shape=validate_shape))
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
    use_locking=use_locking, name=name)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/ops/op_def_library.py", line 693, in apply_op
    op_def=op_def)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/framework/ops.py", line 2186, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/andy/Downloads/syntaxnet/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/tf/tensorflow/python/framework/ops.py", line 1170, in __init__
    self._traceback = _extract_stack()

Children in c2d.py when converting Sejong Corpus

In c2d.py, find_gov determines 'HEAD' of CoNLL-U format to each node with 4 rules including head final rule, whose children are determined in make_edge.

Could you please explain the usage of lchild and rchild?
It seems like one node can only have two children, and child node is attached to parent node as lchild by default.
But I couldn't understand

what lchild and rchild mean in parsing tree
why one node can have only two children.

Are they leftmost child and rightmost child surrounding inner children?

def find_gov(node) :
    '''
    * node = leaf node

    1. head final rule
      - 현재 node에서 parent를 따라가면서
        첫번째로 right child가 있는 node를 만나면
        해당 node의 right child를 따라서 leaf node까지 이동
    2. VX rule
      - 보조용언을 governor로 갖는다면 본용언으로 바꿔준다. 
      - 보조용언은 아니지만 보조용언처럼 동작하는 용언도 비슷하게 처리한다. ex) '지니게 되다'
    3. VNP rule
      - 'VNP 것/NNB + 이/VCP + 다/EF' 형태를 governor로 갖는다면 앞쪽 용언으로 바꿔준다. 
    4. VA rule
      - '있/VA, 없/VA, 같/VA'가 governor인 경우, 앞쪽에 'ㄹ NNB' 형태가 오면 앞쪽 용언으로 바꿔준다. 
        node['pleaf'] 링크를 활용한다. 
    '''
    # 첫번째로 right child가 있는 node를 탐색
    # sibling link를 활용한다. 
    next = node
    found = None
    while next :
        if next['sibling'] :
            found = next['sibling']['parent']
            break
        next = next['parent']

    gov_node = None
    if found :
        # right child를 따라서 leaf node까지
        next = found
        while next :
            if next['leaf'] :
                gov_node = next
                # -----------------------------------------------------------------
                # gov_node가 vx rule을 만족하는 경우 parent->lchild를 따라간다. 
                if check_vx_rule(gov_node) :
                    new_gov_node = find_for_vx_rule(node, gov_node)
                    if new_gov_node : gov_node = new_gov_node
                # gov_node가 vnp rule을 만족하는 경우 parent->lchild를 따라간다. 
                if check_vnp_rule(gov_node) :
                    new_gov_node = find_for_vnp_rule(node, gov_node)
                    if new_gov_node :
                        gov_node = new_gov_node
                        # 새로운 지배소가 '있다,없다,같다'인 경우 
                        # check_va_rule을 한번 태워본다. 
                        if check_va_rule(gov_node) :
                            new_gov_node = find_for_va_rule(node, gov_node, search_mode=2)
                            if new_gov_node : gov_node = new_gov_node
                # gov_node가 va rule을 만족하는 경우 parent->lchild를 따라간다. 
                if check_va_rule(gov_node) :
                    new_gov_node = find_for_va_rule(node, gov_node, search_mode=1)
                    if new_gov_node : gov_node = new_gov_node
                # -----------------------------------------------------------------
                break
            next = next['rchild']
    if gov_node :
        return gov_node['eoj_idx']
    return 0


def make_edge(top, node) :
    if not top['lchild'] : # link to left child
        top['lchild'] = node
        node['parent'] = top
        if VERBOSE : print node_string(top) + '-[left]->' + node_string(node)
    elif not top['rchild'] : # link to right child
        top['rchild'] = node
        node['parent'] = top
        top['lchild']['sibling'] = node
        if VERBOSE : print node_string(top) + '-[right]->' + node_string(node)
    else :
        return False
    return True

How to use conll2017 baseline ？

Thanks for your great works！I saw your reply on stackoverflow, i know you have built your own system, i have two problemsa about it:

You trianed your model on English, i also trian once. Official offer different language models for conll2017 baselines，i don't konw which are entries for modify scipt to train differnt language models?
Your eval scipt is well , but their README mentioned the baseline_eval.py can't find , do you know where is it?
I am sorry for that my problems maybe not directly related to your models. But those are really important for me，if you know please tell me , thanks very much.

./sejong/c2d.sh error

I had a problem with [training parser from Sejong treebank corpus]

,/sejong/split.sh -v -v is ok

but, ./sejong/c2d.sh -v -v had error

what should i do??

Invalid Save Path

Hi dsindex,

Firstly thank you so much for writing your script! It actually works, unlike the official SyntaxNet documentation.

I'm running ./train.sh -v -v and getting the following error"
INFO:tensorflow:Building training network with parameters: feature_sizes: [8 2 3 3] domain_sizes: [2126 5 1701 2008] Traceback (most recent call last): File "/home/ubuntu/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/syntaxnet/parser_eval.py", line 149, in <module> tf.app.run() File "/home/ubuntu/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/external/tf/tensorflow/python/platform/app.py", line 30, in run sys.exit(main(sys.argv)) File "/home/ubuntu/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/syntaxnet/parser_eval.py", line 145, in main Eval(sess, num_actions, feature_sizes, domain_sizes, embedding_dims) File "/home/ubuntu/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/syntaxnet/parser_eval.py", line 98, in Eval parser.saver.restore(sess, FLAGS.model_path) File "/home/ubuntu/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/external/tf/tensorflow/python/training/saver.py", line 1102, in restore raise ValueError("Restore called with invalid save path %s" % save_path) ValueError: Restore called with invalid save path /home/ubuntu/models/syntaxnet/work/UD_English/tmp/syntaxnet-output/brain_pos/greedy/128-0.08-3600-0.9-0/model

For reference, I replaced UD_English/*.conll with shorter files (500 sentences instead of 10000+) so I could train faster, and because my next step is training SyntaxNet on a Twitter treebank (even smaller than 500 sentences for train, tune, and dev).

Thanks in advance for your help!

Check failed: input.part_size() == 1 (0 vs. 1)

I have modified CDIR and PID as absolute paths in Mac OS X, but when run parser_trainer_test.sh there is an error arises below

(tensorflow) zhaoweideMacBook-Pro:work zhaowei$ ./parser_trainer_test.sh

CDIR=/Users/zhaowei/tensorflow/models/syntaxnet/work
PDIR=/Users/zhaowei/tensorflow/models/syntaxnet
cd /Users/zhaowei/tensorflow/models/syntaxnet
SYNTAXNET_HOME=/Users/zhaowei/tensorflow/models/syntaxnet
BINDIR=/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet
CONTEXT=/Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/context.pbtxt
TMP_DIR=/Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output
mkdir -p /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output
cat /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/context.pbtxt
sed s=OUTPATH=/Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output=
PARAMS=128-0.08-3600-0.9-0
/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer --arg_prefix=brain_parser --batch_size=32 --compute_lexicon --decay_steps=3600 --graph_builder=greedy --hidden_layer_sizes=128 --learning_rate=0.08 --momentum=0.9 --output_path=/Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output --task_context=/Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/context --training_corpus=training-corpus --tuning_corpus=tuning-corpus --params=128-0.08-3600-0.9-0 --num_epochs=12 --report_every=100 --checkpoint_every=1000 --logtostderr
INFO:tensorflow:Computing lexicon...
I syntaxnet/lexicon_builder.cc:134] Term maps collected over 964 tokens from 54 documents
I syntaxnet/term_frequency_map.cc:139] Saved 547 terms to /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/word-map.
I syntaxnet/term_frequency_map.cc:139] Saved 529 terms to /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/lcword-map.
I syntaxnet/term_frequency_map.cc:139] Saved 36 terms to /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/tag-map.
I syntaxnet/term_frequency_map.cc:139] Saved 36 terms to /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/category-map.
I syntaxnet/term_frequency_map.cc:139] Saved 39 terms to /Users/zhaowei/tensorflow/models/syntaxnet/work/testdata/tmp/syntaxnet-output/label-map.
F syntaxnet/task_context.cc:140] Check failed: input.part_size() == 1 (0 vs. 1)char-map
./parser_trainer_test.sh: line 38: 3247 Abort trap: 6 "$BINDIR/parser_trainer" --arg_prefix=brain_parser --batch_size=32 --compute_lexicon --decay_steps=3600 --graph_builder=greedy --hidden_layer_sizes=128 --learning_rate=0.08 --momentum=0.9 --output_path=$TMP_DIR --task_context=$TMP_DIR/context --training_corpus=training-corpus --tuning_corpus=tuning-corpus --params=$PARAMS --num_epochs=12 --report_every=100 --checkpoint_every=1000 --logtostderr

Why does it continue its training?

I didn't know why does it continue its training...
n-steps is 100000 but it keeps training over 160000
I did not modify any code and have completed the syntaxnet training perfectly.
What should I do?

Question: train dragnn using dragnn example?

Hi,

I am updated syntaxnet/examples/dragnn/trainer_tutorial.ipynb to train english using official tensorflow/syntaxnet docker image.

docker run -it -p 8888:8888 -v ~/ud-treebanks-conll2017/UD_English:/UD_Eng tensorflow/syntaxnet
open "http://localhost:8888/?token=xxxx" in the browser
open dragnn/trainer_tutorial.ipynb and change it to train english, for example:

DATA_DIR = '/UD_Eng'
TENSORBOARD_DIR = '/notebooks/tensorboard'
CHECKPOINT_FILENAME = '{}/eng-checkpoint'.format(DATA_DIR)
TRAINING_CORPUS_PATH = '{}/en-ud-train.conllu'.format(DATA_DIR)
DEV_CORPUS_PATH = '{}/en-ud-dev.conllu'.format(DATA_DIR)

then change the text to english to validate
Visualize the output of our mini-trained model on a test sentence.

text = 'go to USA'

then run the script in the notebook web application. The training is done, but parsing result is incorrect.

So question:can I can use the script trainer_tutorial.ipynb for training? The script seems much simpler than this github. How to modify it to run training sucessfully?

Thanks.

how to find sejong_treebank.txt.v1 ??

Hello dsindex.
I try to test your sample, but I can't found sejong_treebank.txt.v1.
where I find this file?

Using the train.sh script to train on UD_Norwegian dataset

I'm trying to train a model for the UD_Norwegian dataset. When I run the train.sh on my mac it runs for a few hours, and then it just stops when doing the beam search. In the syntaxnet-output folder I can see that the structured folder has some data in it in the status file, but is seems to have just stopped at one point, and I don't get the analysis that should be output from your train.sh file. This is the status file output:

Parameters: 200x200-0.02-100-0.9-0 | Steps: 5000 | Tuning score: 85.00% | Best tuning score: 85.00%
Parameters: 200x200-0.02-100-0.9-0 | Steps: 10000 | Tuning score: 85.05% | Best tuning score: 85.05%
Parameters: 200x200-0.02-100-0.9-0 | Steps: 15000 | Tuning score: 85.02% | Best tuning score: 85.05%
Parameters: 200x200-0.02-100-0.9-0 | Steps: 20000 | Tuning score: 84.94% | Best tuning score: 85.05%

Is the tuning score the accuracy of the dependency parser?

I read in one of your comments somewhere (can't recall where) that it took you approximately 1 day to run the training?

stuck in second step of "Preprocessing with the Tagger"

Thank you very much for the tutorial. But I have the following problems with the Mac OSX operating system, I stuck in second step of "Preprocessing with the Tagger". Which in your train.sh is preprocess_with_tagger step.

I have completed the first step that is "Training the SyntaxNet POS Tagger", and generated directory models,

My directory structure is as follows:

--models/syntaxnet
-- english/ # Download from Universal Dependencies and use UD_English
---- en-ud-dev.conllu
---- en-ud-test.conllu
---- en-ud-train.conllu
-- trainer.sh # Training the SyntaxNet POS Tagger script
-- preprocessingWithTagger.sh
-- syntaxnet/context.pbtxt # modified params of the file_pattern flag training-corpus, tuning-corpus, dev-corpus correspond to english/en-ud-train.conllu.conv, english/en-ud-test.conllu.conv, english/en-ud-dev.conllu.conv

The script about trainer.sh:

#!/bin/bash

bazel-bin/syntaxnet/parser_trainer \
      --task_context=syntaxnet/context.pbtxt \
      --arg_prefix=brain_pos \
      --compute_lexicon \
      --graph_builder=greedy \
      --training_corpus=training-corpus \
      --tuning_corpus=tuning-corpus \
      --output_path=models \
      --batch_size=256 \
      --decay_steps=3600 \
      --hidden_layer_sizes=128 \
      --learning_rate=0.08 \
      --momentum=0.9 \
      --seed=0 \
      --params=128-0.08-3600-0.9-0

This script looks normal when it runs, and it generates a directory named models:
models/
models//brain_pos
models//brain_pos/greedy
models//brain_pos/greedy/128-0.08-3600-0.9-0
models//brain_pos/greedy/128-0.08-3600-0.9-0/category-map
models//brain_pos/greedy/128-0.08-3600-0.9-0/char-map
models//brain_pos/greedy/128-0.08-3600-0.9-0/checkpoint
models//brain_pos/greedy/128-0.08-3600-0.9-0/context
models//brain_pos/greedy/128-0.08-3600-0.9-0/graph
models//brain_pos/greedy/128-0.08-3600-0.9-0/label-map
models//brain_pos/greedy/128-0.08-3600-0.9-0/latest-model
models//brain_pos/greedy/128-0.08-3600-0.9-0/latest-model.meta
models//brain_pos/greedy/128-0.08-3600-0.9-0/lcword-map
models//brain_pos/greedy/128-0.08-3600-0.9-0/model
models//brain_pos/greedy/128-0.08-3600-0.9-0/model.meta
models//brain_pos/greedy/128-0.08-3600-0.9-0/prefix-table
models//brain_pos/greedy/128-0.08-3600-0.9-0/status
models//brain_pos/greedy/128-0.08-3600-0.9-0/suffix-table
models//brain_pos/greedy/128-0.08-3600-0.9-0/tag-map
models//brain_pos/greedy/128-0.08-3600-0.9-0/tag-to-category
models//brain_pos/greedy/128-0.08-3600-0.9-0/word-map

next I run the script preprocessingWithTagger.sh,

#!/bin/bash

PARAMS=128-0.08-3600-0.9-0
for SET in training tuning dev; do
  bazel-bin/syntaxnet/parser_eval \
    --task_context=models/brain_pos/greedy/$PARAMS/context \
    --hidden_layer_sizes=128 \
    --input=$SET-corpus \
    --output=tagged-$SET-corpus \
    --arg_prefix=brain_pos \
    --graph_builder=greedy \
    --model_path=models/brain_pos/greedy/$PARAMS/model
done

console show the above error:
(tensorflow) zhaoweideMacBook-Pro:syntaxnet zhaowei$ ./preprocessingWithTagger.sh

Part of error code below:

Caused by op u'save/Assign_15', defined at:
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 161, in
tf.app.run()
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 157, in main
Eval(sess)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 113, in Eval
parser.AddSaver(FLAGS.slim_model)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/graph_builder.py", line 568, in AddSaver
self.saver = tf.train.Saver(variables_to_save)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1078, in init
self.build()
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1107, in build
restore_sequentially=self.restore_sequentially)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 705, in build
restore_sequentially, reshape)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 454, in AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 211, in restore
self.op.get_shape().is_fully_defined())
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
use_locking=use_locking, name=name)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 2386, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/Users/zhaowei/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 1298, in __init
self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [32,50] rhs shape= [256,50]
[[Node: save/Assign_15 = AssignT=DT_FLOAT, _class=["loc:@transition_scores"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"]]

How to train Chinese tokenizer using Syntaxnet?

Hi , I have trained my Chinese model refering to how to train English model. But how can I make Chinese tokenizer? When I test my Chinese model with MODEL_DIRECTORY=/xy/models/syntaxnet/Chinesemodel
echo '這樣的處理也衍生了一些問題.' | ./parse.sh The result is just all the sentence as a noun.

Serving different language model #2 - Export

Hello @dsindex
Thank you for bringing up this project, I believe it's magic to have it.

Struggling to put things together to serve Croatian model from parsey's cousins on tensorflow serving. (Let's call it "Different-Lang" model). I have reached the point where the trained model needs to be exported, but I cannot get my head around it. Both models_tf in serving and syntaxnet for training was checkout at a4b7bb9a5dd2c021edcd3d68d326255c734d0ef0 .

First things first, I built TF serving as described.

git checkout 89e9dfbea055027bc31878ee8da66b54a701a746
git submodule update --init --recursive
cd models_tf 
git checkout a4b7bb9a5dd2c021edcd3d68d326255c734d0ef0
...
cd serving
bazel --output_user_root=bazel_root build --nocheck_visibility -c opt -s //tensorflow_serving/example:parsey_api --genrule_strategy=standalone --spawn_strategy=standalone --verbose_failures

Next, built SyntaxNet for CPU on a different machine, like a separate project

cd syntaxnet/tensorflow
./configure
cd ..
bazel test syntaxnet/... util/utf8/...

Test Parser Trainer

./parser_trainer_test.sh
# INFO:tensorflow:Seconds elapsed in evaluation: 0.30, eval metric: 89.83%
# + echo PASS
# PASS

Download UD and train Different-Lang model

(downloading ud-treebanks-v2.0.tgz)
# copy UD_Different-Lang folder to “work” folder including there 2 files : ...-ud-dev.conllu / ...-ud-train.conllu. There is no test set in UD corpus v.2.0, so I substitute it with dev.

# add the context.pbtxt and update file location value, file names “...-ud-...” + record-format to "different-lang-text"
./train.sh -v -v

So now I ended up having a project with TF serving (produced at step 1), and a separate project where I trained the model. This latter project has the following structure of directories:
├── autoencoder
├── inception
├── namignizer
├── neural_gpu
├── swivel
├── syntaxnet
│   ├── syntaxnet
│   │   ├── models
│   │   │   └── parsey_mcparseface
│   │   ├── ops
│   │   └── testdata
│   ├── tensorflow
│   │   ├── tools
│   │   └── util
│   │   └── python
│   │   ├── python_include -> /usr/include/python2.7
│   │   └── python_lib -> /usr/lib/python2.7/dist-packages
│   ├── third_party
│   │   └── utf
│   ├── tools
│   ├── util
│   │   └── utf8
│   └── work
│   ├── api
│   │   ├── parsey_client
│   │   │   └── api
│   │   │   ├── cali
│   │   │   │   └── nlp
│   │   │   └── syntaxnet
│   │   └── parsey_model
│   │   └── assets
│   ├── corpus
│   │   └── ud-treebanks-v2.0
│   ├── English
│   ├── models
│   ├── models_sejong
│   ├── sejong
│   ├── testdata
│   │   └── tmp
│   │   └── syntaxnet-output
│   │   └── brain_parser
│   │   ├── greedy
│   │   │   └── 128-0.08-3600-0.9-0
│   │   └── structured
│   │   └── 128-0.001-3600-0.9-0
│   ├── UD_Different-Lang
│   │   └── tmp
│   │   └── syntaxnet-output
│   │   ├── brain_parser
│   │   │   ├── greedy
│   │   │   │   └── 512x512-0.08-4400-0.85-4
│   │   │   └── structured
│   │   │   └── 512x512-0.02-100-0.9-0
│   │   └── brain_pos
│   │   └── greedy
│   │   └── 64-0.08-3600-0.9-0
│   └── UD_English
└── transformer
└── data

Also completed these steps:

$ cp ../api/parsey_mcparseface.py tensorflow_serving/example
$ bazel --output_user_root=bazel_root build --nocheck_visibility -c opt -s //tensorflow_serving/example:parsey_mcparseface --genrule_strategy=standalone --spawn_strategy=standalone --verbose_failures
$ ls bazel-bin/tensorflow_serving/example/parsey_mcparseface

Then I simply copy and paste UD_Different-Lang folder with all trained results to TF Serving work folder side by side with UD_English and set path:

$ cat ../models/context.pbtxt.template | sed "s=OUTPATH=/home/alina/work/UD_Different-Lang/tmp/syntaxnet-output/brain_pos/greedy/64-0.08-3600-0.9-0=" > ../models/context.pbtxt
$ bazel-bin/tensorflow_serving/example/parsey_mcparseface --model_dir=../models --export_path=exported

However, this produced an error. Where did I get it wrong? Thank you!
bazel-bin/tensorflow_serving/example/parsey_mcparseface --model_dir=../models --export_path=exported I external/syntaxnet/syntaxnet/term_frequency_map.cc:101] Loaded 41 terms from /home/alina/work/UD_Croatian/tmp/syntaxnet-output/brain_pos/greedy/64-0.08-3600-0.9-0/label-map. I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:35] Features: I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:36] Embedding names: I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: I external/syntaxnet/syntaxnet/term_frequency_map.cc:101] Loaded 41 terms from /home/alina/work/UD_Croatian/tmp/syntaxnet-output/brain_pos/greedy/64-0.08-3600-0.9-0/label-map. I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:35] Features: input.word input(1).word input(2).word input(3).word stack.word stack(1).word stack(2).word stack(3).word stack.child(1).word stack.child(1).sibling(-1).word stack.child(-1).word stack.child(-1).sibling(1).word stack(1).child(1).word stack(1).child(1).sibling(-1).word stack(1).child(-1).word stack(1).child(-1).sibling(1).word stack.child(2).word stack.child(-2).word stack(1).child(2).word stack(1).child(-2).word;input.tag input(1).tag input(2).tag input(3).tag stack.tag stack(1).tag stack(2).tag stack(3).tag stack.child(1).tag stack.child(1).sibling(-1).tag stack.child(-1).tag stack.child(-1).sibling(1).tag stack(1).child(1).tag stack(1).child(1).sibling(-1).tag stack(1).child(-1).tag stack(1).child(-1).sibling(1).tag stack.child(2).tag stack.child(-2).tag stack(1).child(2).tag stack(1).child(-2).tag;stack.child(1).label stack.child(1).sibling(-1).label stack.child(-1).label stack.child(-1).sibling(1).label stack(1).child(1).label stack(1).child(1).sibling(-1).label stack(1).child(-1).label stack(1).child(-1).sibling(1).label stack.child(2).label stack.child(-2).label stack(1).child(2).label stack(1).child(-2).label I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;tags;labels I external/syntaxnet/syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;32;32 I external/syntaxnet/syntaxnet/term_frequency_map.cc:101] Loaded 34340 terms from /home/alina/work/UD_Croatian/tmp/syntaxnet-output/brain_pos/greedy/64-0.08-3600-0.9-0/word-map. I external/syntaxnet/syntaxnet/term_frequency_map.cc:101] Loaded 17 terms from /home/alina/work/UD_Croatian/tmp/syntaxnet-output/brain_pos/greedy/64-0.08-3600-0.9-0/tag-map. Traceback (most recent call last): File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/tensorflow_serving/example/parsey_mcparseface.py", line 188, in <module> tf.app.run() File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/external/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run sys.exit(main(sys.argv)) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/tensorflow_serving/example/parsey_mcparseface.py", line 172, in main model[prefix]["documents"] = Build(sess, source, model[prefix]) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/tensorflow_serving/example/parsey_mcparseface.py", line 75, in Build document_source=document_source) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/external/syntaxnet/syntaxnet/structured_graph_builder.py", line 242, in AddEvaluation document_source=document_source)) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/external/syntaxnet/syntaxnet/structured_graph_builder.py", line 100, in _AddBeamReader documents_from_input=documents_from_input) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/external/syntaxnet/syntaxnet/ops/gen_parser_ops.py", line 100, in beam_parse_reader name=name) File "/home/alina/work/serving/bazel-bin/tensorflow_serving/example/parsey_mcparseface.runfiles/tf_serving/external/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 627, in apply_op (key, op_type_name, attr_value.i, attr_def.minimum)) ValueError: Attr 'feature_size' of 'BeamParseReader' Op passed 0 less than minimum 1.

How to train Chinese corpus after downloading the universal-dependencies-2.0 ?

thank you so much !

I couldn't find the method.

Test error after successfully training with ./train.sh -v -v

I downloaded your git repo as work directory and successfully trained the model with UD_English with ./train.sh -v -v command.

root@19a0b2aad139:~/models/syntaxnet# git clone https://github.com/j-min/syntaxnet_easy_sejong work
Cloning into 'work'...
remote: Counting objects: 631, done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 631 (delta 1), reused 0 (delta 0), pack-reused 624
Receiving objects: 100% (631/631), 109.51 MiB | 6.67 MiB/s, done.
Resolving deltas: 100% (375/375), done.
Checking connectivity... done.
root@19a0b2aad139:~/models/syntaxnet# cd work
root@19a0b2aad139:~/models/syntaxnet/work# vim train.sh
root@19a0b2aad139:~/models/syntaxnet/work# ./train.sh -v -v
+ '[' 0 '!=' 0 ']'
++++ readlink -f ./train.sh
+++ dirname /root/models/syntaxnet/work/train.sh
++ readlink -f /root/models/syntaxnet/work
+ CDIR=/root/models/syntaxnet/work
++++ readlink -f ./train.sh
+++ dirname /root/models/syntaxnet/work/train.sh
++ readlink -f /root/models/syntaxnet/work/..
+ PDIR=/root/models/syntaxnet
+ make_calmness
+ exec
+ exec
+ cd /root/models/syntaxnet
+ python=/usr/bin/python
+ SYNTAXNET_HOME=/root/models/syntaxnet
+ BINDIR=/root/models/syntaxnet/bazel-bin/syntaxnet
+ CORPUS_DIR=/root/models/syntaxnet/work/UD_English
+ CONTEXT=/root/models/syntaxnet/work/UD_English/context.pbtxt
+ TMP_DIR=/root/models/syntaxnet/work/UD_English/tmp/syntaxnet-output
+ MODEL_DIR=/root/models/syntaxnet/work/models

However, when I try to evaluate the newly trained the model with
"please parse this." | ./test.sh
It ouputs error messages as below.

root@19a0b2aad139:~/models/syntaxnet/work# "please parse this." | ./test.sh
bash: please parse this.: command not found
F syntaxnet/term_frequency_map.cc:62] Check failed: ::tensorflow::Status::OK() == (tensorflow::Env::Default()->NewRandomAccessFile(filename, &file)) (OK vs. Not found: label-map)
F syntaxnet/term_frequency_map.cc:62] Check failed: ::tensorflow::Status::OK() == (tensorflow::Env::Default()->NewRandomAccessFile(filename, &file)) (OK vs. Not found: label-map)

I trained the model twice but it gave me the same error message.

Do you have any idea related to
F syntaxnet/term_frequency_map.cc:62] Check failed: ::tensorflow::Status::OK() == (tensorflow::Env::Default()->NewRandomAccessFile(filename, &file)) (OK vs. Not found: label-map)

Question: missing segmenter in dragnn model

I run dragnn training script, the training result missing "segmenter" directory compared to pre-trained ParseySaurus model, which is available here.
https://drive.google.com/file/d/0BxpbZGYVZsEeSFdrUnBNMUp1YzQ/view?usp=sharing

the directory "segmenter" is must for the script https://github.com/tensorflow/models/blob/master/research/syntaxnet/examples/dragnn/interactive_text_analyzer.ipynb.

Do you how to generate "segmenter" in the model?

Launch server with different model

Thank you for the great tutorial!

I am able to launch

./bazel-bin/tensorflow_serving/example/parsey_api --port=9000 ../api/parsey_model

and it works fine with English. What changes do I need to make to start the server with a Russian model? I have tried to change files in ../api/parsey_model with no result.

How to retrain existing Syntaxnet model?

Is there a way to retrain the syntaxnet POS tagger model with new dataset?

Can't resolve training syntaxnet with custom corpora.

I want to train syntaxnet model with custom corpora ( let's say EN from universal dependencies ).

I have done following changes:

I have changed context.pbtxt by adding
input { name: 'training-corpus' record_format: 'conll-sentence' Part { file_pattern: '/home/prakhar/Downloads/ud-treebanks-v1.3/UD_English/en-ud-train.conllu' } } input { name: 'tuning-corpus' record_format: 'conll-sentence' Part { file_pattern: '/home/prakhar/Downloads/ud-treebanks-v1.3/UD_English/en-ud-dev.conllu' } } input { name: 'dev-corpus' record_format: 'conll-sentence' Part { file_pattern: '/home/prakhar/Downloads/ud-treebanks-v1.3/UD_English/en-ud-test.conllu' } }
Also, I have made a file with the name tagger.sh and copied below contents to it
bazel-bin/syntaxnet/parser_trainer \ --task_context=syntaxnet/context.pbtxt \ --arg_prefix=brain_pos \ # read from POS configuration --compute_lexicon \ # required for first stage of pipeline --graph_builder=greedy \ # no beam search --training_corpus=training-corpus \ # names of training/tuning set --tuning_corpus=tuning-corpus \ --output_path=models \ # where to save new resources --batch_size=32 \ # Hyper-parameters --decay_steps=3600 \ --hidden_layer_sizes=128 \ --learning_rate=0.08 \ --momentum=0.9 \ --seed=0 \ --params=128-0.08-3600-0.9-0 # name for these parameters

How do I train it now ? I have tried ./tagger.sh but says Permission denied.
Do I need to change anything else anywhere ?

GPU device not visible

Hi
I tried to train my own model using './train_dragnn.sh -v -v &' but it doesn't use GPU device.
I added this code in train_dragnn.py and it is only cpu device available:

from tensorflow.python.client import device_lib
print device_lib.list_local_devices()

Tensorflow compiled with GPU support and when i create new python file i can list devices:

import tensorflow as tf
from tensorflow.python.client import device_lib
print device_lib.list_local_devices()
Output:
[name: "/cpu:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 3558873585769960542
, name: "/gpu:0"
device_type: "GPU"
memory_limit: 10574476084
locality {
bus_id: 1
}
incarnation: 8461732259736364275
physical_device_desc: "device: 0, name: Graphics Device, pci bus id: 0000:01:00.0"
]

So, maybe you can suggest me how to use my gpu in training process or maybe i need to put some flags to enable gpu support.

Regards,
Vladimir

Problem running serving

I was following https://github.com/dsindex/syntaxnet/blob/master/README_api.md and got an error at this step:

$ bazel --output_user_root=bazel_root build --nocheck_visibility -c opt -s //tensorflow_serving/example:parsey_api --genrule_strategy=standalone --spawn_strategy=standalone --verbose_failures



Extracting Bazel installation...
.........
ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Extension file not found. Unable to load package for '//third_party/gpus:cuda_configure.bzl': BUILD file not found on package path.
INFO: Elapsed time: 1.905s

Cannot train POS on another corpus ...

Hi !
First of all, i would like to thank you for this great tool and convert file you provide, it works just great.

But i am facing some issues with the french corpua.

could you please correct / complete my understanding of the configuration activities required for training on a another copus ? :

Create a new folder in work (in my example UD_French) with 3 files : *-ud-dev.conllu / *-ud-test.conllu / **-ud-train.conllu
Add the context.pbtxt and update file location value + record-format to "french-text"
Update train.sh with correct file location value
Run train.sh

Than I am stuck with that error :

File "/home/baduel/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/external/tf/tensorflow/python/client/session.py", line 673, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: indices[0] = -1 is not in [0, 1)
     [[Node: training/embedding_lookup_4 = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training/Diag"], validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](training/Diag, training/gold_actions)]]
Caused by op u'training/embedding_lookup_4', defined at:
  File "/home/baduel/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/parser_trainer.py", line 303, in <module>
    tf.app.run()

Any help would be life saving :)

Best regards
Edulba

UD_Italian v.2.0 training OK but test KO

Hi there,
I've just finished to train a new model based on UD_Italian http://ufal.mff.cuni.cz/~zeman/soubory/ud-treebanks-conll2017.tgz corpus with Universal Dependencies release 2.0.
... everything fine but when I run the ./test.sh , the script do not find the models:

"File path is: %r" % (save_path, file_path))
ValueError: Restore called with invalid save path: '/opt/tensorflow/models/syntaxnet/work/models/tagger-params/model'. File path is: '/opt/tensorflow/models/syntaxnet/work/models/tagger-params/model'

here the complete log:

so I tried to rename the models to let it found

/opt/tensorflow/models/syntaxnet/work/models/tagger-params/model.meta (the file I found in the directory) to -> /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model

/opt/tensorflow/models/syntaxnet/work/models/parser-params/model.meta (the file I found in the directory) to -> /opt/tensorflow/models/syntaxnet/work/models/parser-params/model

and then run again the ./test.sh
this time found the models but the error is different..

Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

here the complete log:

root@8d5f1d0de6eb:/opt/tensorflow/models/syntaxnet/work# echo "prova in Italiano" | ./test.sh
I syntaxnet/term_frequency_map.cc:103] Loaded 44 terms from /opt/tensorflow/models/syntaxnet/work/models/label-map.
I syntaxnet/term_frequency_map.cc:103] Loaded 44 terms from /opt/tensorflow/models/syntaxnet/work/models/label-map.
I syntaxnet/embedding_feature_extractor.cc:35] Features: input.word input(1).word input(2).word input(3).word stack.word stack(1).word stack(2).word stack(3).word stack.child(1).word stack.child(1).sibling(-1).word stack.child(-1).word stack.child(-1).sibling(1).word stack(1).child(1).word stack(1).child(1).sibling(-1).word stack(1).child(-1).word stack(1).child(-1).sibling(1).word stack.child(2).word stack.child(-2).word stack(1).child(2).word stack(1).child(-2).word;input.tag input(1).tag input(2).tag input(3).tag stack.tag stack(1).tag stack(2).tag stack(3).tag stack.child(1).tag stack.child(1).sibling(-1).tag stack.child(-1).tag stack.child(-1).sibling(1).tag stack(1).child(1).tag stack(1).child(1).sibling(-1).tag stack(1).child(-1).tag stack(1).child(-1).sibling(1).tag stack.child(2).tag stack.child(-2).tag stack(1).child(2).tag stack(1).child(-2).tag;stack.child(1).label stack.child(1).sibling(-1).label stack.child(-1).label stack.child(-1).sibling(1).label stack(1).child(1).label stack(1).child(1).sibling(-1).label stack(1).child(-1).label stack(1).child(-1).sibling(1).label stack.child(2).label stack.child(-2).label stack(1).child(2).label stack(1).child(-2).label
I syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;tags;labels
I syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;32;32
I syntaxnet/embedding_feature_extractor.cc:35] Features: stack(3).word stack(2).word stack(1).word stack.word input.word input(1).word input(2).word input(3).word;input.digit input.hyphen;stack.suffix(length=2) input.suffix(length=2) input(1).suffix(length=2);stack.prefix(length=2) input.prefix(length=2) input(1).prefix(length=2)
I syntaxnet/embedding_feature_extractor.cc:36] Embedding names: words;other;suffix;prefix
I syntaxnet/embedding_feature_extractor.cc:37] Embedding dims: 64;4;8;8
I syntaxnet/term_frequency_map.cc:103] Loaded 27170 terms from /opt/tensorflow/models/syntaxnet/work/models/word-map.
I syntaxnet/term_frequency_map.cc:103] Loaded 27170 terms from /opt/tensorflow/models/syntaxnet/work/models/word-map.
I syntaxnet/term_frequency_map.cc:103] Loaded 39 terms from /opt/tensorflow/models/syntaxnet/work/models/tag-map.
I syntaxnet/term_frequency_map.cc:103] Loaded 39 terms from /opt/tensorflow/models/syntaxnet/work/models/tag-map.
INFO:tensorflow:Building training network with parameters: feature_sizes: [20 20 12] domain_sizes: [27173 42 47]
INFO:tensorflow:Building training network with parameters: feature_sizes: [8 2 3 3] domain_sizes: [27173 5 3319 4357]
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 161, in
tf.app.run()
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 157, in main
Eval(sess)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 115, in Eval
parser.saver.restore(sess, FLAGS.model_path)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1437, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.DataLossError: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op u'save/RestoreV2', defined at:
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 161, in
tf.app.run()
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 157, in main
Eval(sess)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 113, in Eval
parser.AddSaver(FLAGS.slim_model)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/graph_builder.py", line 568, in AddSaver
self.saver = tf.train.Saver(variables_to_save)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1078, in init
self.build()
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1107, in build
restore_sequentially=self._restore_sequentially)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 705, in build
restore_sequentially, reshape)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 442, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 281, in restore_op
[spec.tensor.dtype])[0])
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/ops/gen_io_ops.py", line 439, in restore_v2
dtypes=dtypes, name=name)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
op_def=op_def)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 2386, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 1298, in init
self._traceback = _extract_stack()

DataLossError (see above for traceback): Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/tagger-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:968] Data loss: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 161, in
tf.app.run()
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 157, in main
Eval(sess)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/main/syntaxnet/parser_eval.py", line 115, in Eval
parser.saver.restore(sess, FLAGS.model_path)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/training/saver.py", line 1437, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 717, in run
run_metadata_ptr)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 915, in _run
feed_dict_string, options, run_metadata)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 965, in _do_run
target_list, options, run_metadata)
File "/opt/tensorflow/models/syntaxnet/bazel-bin/syntaxnet/parser_eval.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 985, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.DataLossError: Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

DataLossError (see above for traceback): Unable to open table file /opt/tensorflow/models/syntaxnet/work/models/parser-params/model: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

INFO:tensorflow:Read 0 documents`

Any clue?
Thank you so much!

Train error

I'm trying to train UD_English and I want to see output like this. After transform context.pbtxt like context from English and run ./train.sh -v -v I see error:
INFO:tensorflow:Training... INFO:tensorflow:Building training network with parameters: feature_sizes: [] domain_sizes: [] Traceback (most recent call last): File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/parser_trainer.py", line 303, in <module> tf.app.run() File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/external/org_tensorflow/tensorflow/python/platform/app.py", line 30, in run sys.exit(main(sys.argv)) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/parser_trainer.py", line 299, in main Train(sess, num_actions, feature_sizes, domain_sizes, embedding_dims) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/parser_trainer.py", line 212, in Train corpus_name=corpus_name) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/graph_builder.py", line 512, in AddTraining nodes.update(self._AddGoldReader(task_context, batch_size, corpus_name)) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/graph_builder.py", line 381, in _AddGoldReader arg_prefix=self._arg_prefix)) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/syntaxnet/ops/gen_parser_ops.py", line 321, in gold_parse_reader arg_prefix=arg_prefix, name=name) File "/home/anton/models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/external/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 627, in apply_op (key, op_type_name, attr_value.i, attr_def.minimum)) ValueError: Attr 'feature_size' of 'GoldParseReader' Op passed 0 less than minimum 1.

context2.txt

dsindex / syntaxnet Goto Github PK

syntaxnet's Introduction

syntaxnet

description

history

how to test

univeral dependency corpus

training tagger and parser with another corpus

training parser only

test new model

training parser from Sejong treebank corpus

test korean parser model

apply korean POS tagger(Komoran via konlpy)

tensorflow serving and syntaxnet

parsey's cousins

dragnn

brat annotation tool

comparison to BIST parser

syntaxnet's People

Contributors

Stargazers

Watchers

Forkers

syntaxnet's Issues

Recommend Projects

Recommend Topics

Recommend Org