
Comments (7)

dsindex commented on June 13, 2024

Well, it depends, but that is a reasonable time compared to a GPU. I think you can copy the evaluation function from 'train.sh' and run it independently, or you can calculate accuracy directly from the parsed test-set output. If it is around 84%, everything went fine.
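For example, a minimal sketch of the second option, assuming the gold and parsed files are token-aligned, 10-column CoNLL files (the file names here are hypothetical):

# hedged sketch: attachment accuracy from token-aligned, 10-column CoNLL files
# gold.conll / parsed.conll are placeholder names; column 7 holds the HEAD index
paste gold.conll parsed.conll | awk -F'\t' '
  $1 ~ /^[0-9]+$/ {            # token lines only (skip blanks and comments)
    total++
    if ($7 == $17) correct++   # gold HEAD vs predicted HEAD (field 10+7 after paste)
  }
  END { printf "accuracy: %.2f%%\n", 100 * correct / total }'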


dsindex commented on June 13, 2024

@arashsa
I did not test on OS X, so this is a guess:

In train.sh I use 'readlink' and 'dirname', which are not available (in the GNU form the script expects) on OS X by default.

Install coreutils:

brew install coreutils

and replace 'readlink' and 'dirname' with 'greadlink' and 'gdirname'.
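For example, a hedged sed sketch of that substitution (it is a plain text replacement, so back up train.sh first):

cp train.sh train.sh.bak
sed -i '' -e 's/readlink/greadlink/g' -e 's/dirname/gdirname/g' train.sh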

Then run ./train.sh -v -v (verbose mode).

You should see messages like the following:

...
INFO:tensorflow:Epochs: 1, num steps: 8000, seconds elapsed: 107.45, avg cost: 0.27,
INFO:tensorflow:Evaluating training network.
I syntaxnet/reader_ops.cc:141] Starting epoch 9
INFO:tensorflow:Seconds elapsed in evaluation: 5.01, eval metric: 80.97%
INFO:tensorflow:Writing out trained parameters.
INFO:tensorflow:Epochs: 1, num steps: 8100, seconds elapsed: 114.21, avg cost: 0.28,
...

These messages were generated by the pretrain_parser step of train.sh for UD_Norwegian on Ubuntu 14.04. The script is still running; once it finishes, I will let you know whether anything comes up.


dsindex commented on June 13, 2024

@arashsa

On Ubuntu 14.04 everything is fine, and train.sh takes 2~3 hours with GPU support.

Structured model accuracy:

INFO:tensorflow:Total processed documents: 1939
INFO:tensorflow:num correct tokens: 25472
INFO:tensorflow:total tokens: 30034
INFO:tensorflow:Seconds elapsed in evaluation: 25.51, eval metric: 84.81%
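As a quick sanity check, the eval metric above is just the number of correct tokens divided by the total tokens (a throwaway bc one-liner, not part of train.sh):

$ echo "scale=4; 25472 / 30034 * 100" | bc
84.8100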
$ echo "Jeg ser et landskap som er såret og i tilbaketrekning ." | ./test.sh -v -v
Input: Jeg ser et landskap som er såret og i tilbaketrekning .
Parse:
ser VERB ROOT
 +-- Jeg PRON nsubj
 +-- landskap NOUN dobj
 |   +-- et DET det
 |   +-- såret VERB acl:relcl
 |       +-- som PRON nsubj
 |       +-- er VERB cop
 |       +-- og CONJ cc
 |       +-- tilbaketrekning NOUN conj
 |           +-- i ADP case
 +-- . PUNCT punct


arashsa commented on June 13, 2024

I had actually already made the changes you propose, yet I don't get any output from the train.sh script. I do have a functioning model, but isn't it strange that it took only 4 hours to train on CPU?


arashsa commented on June 13, 2024

Ok, I will try to run the accuracy function from train.sh.

Yes, the accuracy is 84.9%. Do you have any idea how to increase it? I find the documentation sparse and have no idea which parameters to experiment with.


dsindex commented on June 13, 2024

I got 88.7% with a Korean corpus, and the known state of the art is near 91%.

So I tried to tune hyperparameters, including:

--hidden_layer_sizes : 200x200, 512x512, ...
--num_epochs : 12, 20, ...
--beam_size : 8, 16, ...

and 'brain_parser_features' (a sketch of how the flags above can be passed is shown after the parameter block below):

Parameter {
  name: 'brain_parser_embedding_dims'
  value: '64;32;32'
}
Parameter {
  name: 'brain_parser_embedding_names'
  value: 'words;tags;labels'
}
Parameter {
  name: 'brain_parser_features'
  value:
  'input.word '
  'input(1).word '
  'input(2).word '
  'input(3).word '
  ...
  'stack(1).child(-2).word; '
  'input.tag '
  'input(1).tag '
  'input(2).tag '
  'input(3).tag '
  ...  
  'stack(1).child(-2).tag; '
  'stack.child(1).label '
  'stack.child(1).sibling(-1).label '
  'stack.child(-1).label '
  ...
}
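As mentioned above, here is a hedged sketch of passing those flags to SyntaxNet's parser_trainer (the trainer that train.sh wraps); the paths, arg_prefix, and values are placeholders to adapt to your own setup:

# hedged sketch only; adjust task_context/output_path to your corpus layout
bazel-bin/syntaxnet/parser_trainer \
  --task_context=corpus/context.pbtxt \
  --arg_prefix=brain_parser \
  --hidden_layer_sizes=512,512 \
  --num_epochs=20 \
  --beam_size=16 \
  --output_path=models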

But the effect of that tuning was not impressive, though I did not investigate many variations.

For an English corpus the situation is the same: the accuracy is around 88%, while the reported accuracy is 94%! Where does the difference come from? I have not found the answer yet.

It is worth reading http://www.petrovi.de/data/acl15.pdf:

...
"For the word embedding matrix Eword, we initialized the parameters using pretrained word embeddings. We used the publicly available word2vec tool (Mikolov et al., 2013) to learn CBOW embeddings following the sample configuration provided with the tool. For words not appearing in the unsupervised data and the special "NULL" etc. tokens, we used random initialization. In preliminary experiments we found no difference between training the word embeddings on 1 billion or 10 billion tokens. We therefore trained the word embeddings on the same corpus we used for tri-training (Chelba et al., 2013). We set Dword = 64 and Dtag = Dlabel = 32 for embedding dimensions and M1 = M2 = 2048 hidden units in our final experiments.
...
All hyperparameters (including structure) were tuned using Section 24 of the WSJ only. When not tri-training, we used hyperparameters of γ = 0.2, η0 = 0.05, µ = 0.9, early stopping after roughly 16 hours of training time. With the tri-training data, we decreased η0 = 0.05, increased γ = 0.5, and decreased the size of the network to M1 = 1024, M2 = 256 for run-time efficiency, and trained the network for approximately 4 days. For the Treebank Union setup, we set M1 = M2 = 1024 for the standard training set and for the tri-training setup"
...
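The "sample configuration provided with the tool" mentioned above is roughly the demo settings shipped with word2vec; a hedged sketch with the embedding size set to 64 to match Dword in the paper (the corpus path is a placeholder):

# hedged sketch: CBOW embeddings with the word2vec tool, demo-style settings
./word2vec -train unlabeled_corpus.txt -output word_embeddings.txt \
  -cbow 1 -size 64 -window 8 -negative 25 -hs 0 -sample 1e-4 \
  -threads 20 -binary 0 -iter 15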


arashsa commented on June 13, 2024

Thanks for the article.

OK, so there seem to be a few approaches they use to increase accuracy:

  1. In our approach we start with the basic structure of Chen and Manning (2014), but with a deeper architecture and improvements to the optimization procedure. These modifications (Section 2) increase the performance of the greedy model by as much as 1%.
  2. By adding 10 million automatically parsed tokens to the training data, we improve the accuracy of our parsers by almost ∼1.0% on web domain data.
  3. Our network avoids any recurrent structure so as to keep inference fast and efficient and to allow the use of simple backpropagation to compute gradients.

So getting the parser above 90% seems to be very tricky. I had hoped that with tuning I could at least reach 90%, and then do some extra work to get above that. I will look into each of these approaches in the future. If you have any additional ideas, I would appreciate your input!

