tdozat / Parser-v1
License: Apache License 2.0
Hi,
I am training on the Universal Dependencies dataset, using Python 2.7 and TensorFlow 0.8, but I get the following error:
python network.py --config_file config/config.cfg --save_dir saves/mymodel
*** Parser ***
Save directory already exists. Press to overwrite or to exit.
Traceback (most recent call last):
  File "network.py", line 372, in <module>
    network = Network(model, **cargs)
  File "network.py", line 73, in __init__
    self._trainset = Dataset(self.train_file, self._vocabs, model, self._config, name='Trainset')
  File "/home/kunal/Parser-v1/dataset.py", line 45, in __init__
    self.rebucket()
  File "/home/kunal/Parser-v1/dataset.py", line 121, in rebucket
    buff = self._file_iterator.next()
  File "/home/kunal/Parser-v1/dataset.py", line 92, in _file_iterator
    buff = self._process_buff(buff)
  File "/home/kunal/Parser-v1/dataset.py", line 104, in _process_buff
    buff[i][j] = (word,) + words[word] + tags[tag1] + tags[tag2] + (int(head),) + rels[rel]
ValueError: invalid literal for int() with base 10: ''
Is there anything wrong with the files?
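The traceback points at int(head) on the HEAD column, so a blank 7th field in the .conllu file is the likely culprit. A minimal sketch of a check that scans token lines for non-integer HEAD values (the function name and sample data are mine, not from the parser):

```python
# Minimal sketch: find token lines in CoNLL-U data whose HEAD column
# (7th field, 1-indexed) is not a plain integer -- the likely cause of
# "invalid literal for int() with base 10: ''".
def find_bad_heads(lines):
    bad = []
    for lineno, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # sentence break or comment line
        fields = line.split("\t")
        if len(fields) != 10:
            bad.append((lineno, "expected 10 tab-separated fields"))
            continue
        head = fields[6]
        if not head.isdigit():
            # multiword-token ranges ("1-2") and empty nodes ("1.1")
            # carry "_" heads and would also trip up int()
            bad.append((lineno, "non-integer HEAD: %r" % head))
    return bad

sample = [
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_",
    "2\tcat\tcat\tNOUN\tNN\t_\t0\troot\t_\t_",
    "3\tsat\tsit\tVERB\tVBD\t_\t\t_\t_\t_",  # empty HEAD field
]
print(find_bad_heads(sample))  # -> [(3, "non-integer HEAD: ''")]
```

Running this over the training file should pinpoint the offending lines before the parser ever sees them.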
Hi Mr. Dozat!
There seems to be a slight problem with your current code, which I assume is due to a TensorFlow version upgrade. When I pulled your repo and ran it, I got the following error:
Traceback (most recent call last):
  File "network.py", line 392, in <module>
    network = Network(model, **cargs)
  File "network.py", line 66, in __init__
    self._ops = self._gen_ops()
  File "network.py", line 308, in _gen_ops
    train_output = self._model(self._trainset)
  File "/home/hongmin/work/biaffine_parser/lib/models/parsers/parser.py", line 38, in __call__
    top_recur, _ = self.RNN(top_recur)
  File "/home/hongmin/work/biaffine_parser/lib/models/nn.py", line 96, in RNN
    dtype=tf.float32)
  File "/home/hongmin/work/biaffine_parser/lib/models/rnn.py", line 422, in dynamic_bidirectional_rnn
    output_fw, output_state_fw = dynamic_rnn(cell_fw, inputs, sequence_length, initial_state_fw, ff_keep_prob, recur_keep_prob, dtype, parallel_iterations, swap_memory, time_major, scope=fw_scope)
  File "/home/hongmin/work/biaffine_parser/lib/models/rnn.py", line 540, in dynamic_rnn
    [_assert_has_shape(sequence_length, [batch_size])]):
  File "/home/hongmin/work/biaffine_parser/lib/models/rnn.py", line 532, in _assert_has_shape
    return logging_ops.Assert(
AttributeError: 'module' object has no attribute 'Assert'
I checked logging_ops.py; it says: "The python wrapper for Assert is in control_flow_ops, as the Assert call relies on certain conditionals for its dependencies. Use control_flow_ops.Assert."
I replaced return logging_ops.Assert with return control_flow_ops.Assert at lib/models/rnn.py, line 532, in _assert_has_shape, and that error went away, but further errors came up.
However, I can't guarantee this change has no side effects elsewhere in the code. If you don't mind, could you debug and update everything for the newer TensorFlow?
Thanks!
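One way to paper over this kind of module rename without editing every call site is a getattr fallback that resolves Assert from whichever module provides it. A minimal sketch of the pattern, using stand-in namespaces so it runs without TensorFlow installed (the real code would import tensorflow.python.ops.logging_ops and control_flow_ops instead):

```python
from types import SimpleNamespace

# Stand-ins for the two TF modules (assumption: these mimic newer TF,
# where logging_ops no longer exposes Assert but control_flow_ops does).
logging_ops = SimpleNamespace()
control_flow_ops = SimpleNamespace(Assert=lambda cond, data: ("assert", cond))

# Resolve Assert from whichever module has it, so the same rnn.py code
# would work on both old and new TensorFlow versions.
Assert = getattr(logging_ops, "Assert", None) or control_flow_ops.Assert

print(Assert(True, [])[0])  # -> assert
```

With this shim defined once near the imports, _assert_has_shape can simply call Assert(...) regardless of the TensorFlow version.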
56 while self._splits[i-1] >= self._splits[i] or self._splits[i-1] not in self._len_cntr:
57     self._splits[i-1] -= 1
1. When self._splits[i-1] = 14 and self._splits[i] = 5 (which is possible, because self._splits[i] was set in the previous loop iteration), line 57 executes. Then, when self._splits[i-1] = 4 and 4 is not in self._len_cntr, line 57 executes again, so self._splits[i-1] keeps decreasing. There is no end.
2. What is this loop supposed to do?
When preprocessing the data, there is an infinite loop here: https://github.com/tdozat/Parser/blob/master/lib/etc/k_means.py#L56, since self._splits[i-1]
simply gets decremented indefinitely.
Any idea what might be going wrong? If it helps, I'm using Universal Dependencies 1.3, which seems to be the latest version with formatting supported by this code.
Thanks!
It seems that an older version (<1.0) of TensorFlow was used in your code. Could you specify which version?
Hello,
Do I need to train an English model myself, or is there a ready-made English model for parsing?
Hi,
In my first attempt to train a parser with your code, I hit the following problem (my TensorFlow version is 1.3.0):
Traceback (most recent call last):
  File "network.py", line 372, in <module>
    network = Network(model, **cargs)
  File "network.py", line 77, in __init__
    self._ops = self._gen_ops()
  File "network.py", line 302, in _gen_ops
    train_output = self._model(self._trainset)
  File "/proj/mlnlp/rasooli/tools/dozat_parser/Parser/lib/models/parsers/parser.py", line 39, in __call__
    embed_inputs = self.embed_concat(word_inputs, tag_inputs)
  File "/proj/mlnlp/rasooli/tools/dozat_parser/Parser/lib/models/nn.py", line 65, in embed_concat
    noise_shape = tf.pack([tf.shape(word_inputs)[0], tf.shape(word_inputs)[1], 1])
AttributeError: 'module' object has no attribute 'pack'
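tf.pack and tf.unpack were renamed to tf.stack and tf.unstack in TensorFlow 1.0, which is why 1.3.0 fails here. A hedged sketch of a one-off source rewrite for the rename (the RENAMES table is an assumption covering only the symbols this codebase appears to use; TensorFlow also shipped an official tf_upgrade.py migration script for the 0.x to 1.0 transition):

```python
import re

# tf.pack/tf.unpack became tf.stack/tf.unstack in TensorFlow 1.0; this
# sketch rewrites source text accordingly.
RENAMES = {
    "tf.pack": "tf.stack",
    "tf.unpack": "tf.unstack",
}

def port_to_tf1(source):
    for old, new in RENAMES.items():
        # word boundaries so longer identifiers aren't clipped
        source = re.sub(r"\b%s\b" % re.escape(old), new, source)
    return source

line = "noise_shape = tf.pack([tf.shape(word_inputs)[0], 1])"
print(port_to_tf1(line))  # -> noise_shape = tf.stack([tf.shape(word_inputs)[0], 1])
```

Applying this across lib/models/ should clear the AttributeError, though other 0.x-era API calls may surface afterwards.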
It seems your code picks up the UPOSTAG field as the tag feature, but according to your paper it should be XPOSTAG.
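For reference, the CoNLL-U column order puts UPOSTAG 4th and XPOSTAG 5th, so a loader that wants XPOSTAG must read field index 4 rather than 3. A tiny sketch of reading one versus the other (the helper name is mine, not from the parser):

```python
# CoNLL-U field order; UPOSTAG is the 4th column, XPOSTAG the 5th.
CONLLU_FIELDS = ["ID", "FORM", "LEMMA", "UPOSTAG", "XPOSTAG",
                 "FEATS", "HEAD", "DEPREL", "DEPS", "MISC"]

def get_tag(token_line, which="XPOSTAG"):
    fields = token_line.split("\t")
    return fields[CONLLU_FIELDS.index(which)]

line = "2\tcat\tcat\tNOUN\tNN\t_\t0\troot\t_\t_"
print(get_tag(line, "UPOSTAG"), get_tag(line, "XPOSTAG"))  # -> NOUN NN
```

If the paper's results used language-specific tags, the dataset loader would need to index the XPOSTAG column as above.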