frankwork / conv_relation
TensorFlow implementation of Relation Classification via Convolutional Deep Neural Network
According to the documentation in scorer.pl:
In the examples above, the first three files are OK, while the last one contains four errors. And answer_key2.txt contains the true labels for the training dataset.
So the first argument to scorer.pl should be the network's predictions, and the second argument should be test_keys.txt. But in your log files I found this line:
!!!WARNING!!! The proposed file contains 1 label(s) of type 'Entity-Destination(e2,e1)', which is NOT present in the key file.
It seems that you passed test_keys.txt as the first argument and the network's predictions as the second.
Hello, running the code directly produces the following error:
Parent directory of saved_models/cnn-200-50/model.ckpt doesn't exist, can't save.
How can this be fixed?
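A likely fix (my suggestion, not part of the repo): create the checkpoint directory before TensorFlow's saver tries to write to it. The path below is taken from the error message; adjust it if your config differs.

```python
# Create the missing checkpoint directory reported in the error message.
import os

os.makedirs('saved_models/cnn-200-50', exist_ok=True)  # no-op if it already exists
```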
Why is the lexical feature vector reshaped to 6 × word_dim? What does the 6 stand for?
Hello,
I couldn't understand why rid is included when building the tf.train.SequenceExample(), in the following lines:
rid = raw_example.label
ex.context.feature['rid'].int64_list.value.append(rid)
Does this mean that rid is treated as a feature to train the CNN on?
Thanks in advance
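For what it's worth, my understanding of the layout (a pure-Python sketch of what the SequenceExample holds, not the repo's code or the TensorFlow API): rid is the relation label (it comes from raw_example.label), stored in the example's context so it stays attached to the serialized record; the CNN's actual inputs live in the feature lists.

```python
def pack_example(token_ids, rid):
    """Sketch of the SequenceExample layout: context vs. feature_lists."""
    return {
        'context': {'rid': [rid]},               # per-example label, not a model input
        'feature_lists': {'tokens': token_ids},  # variable-length CNN inputs
    }

ex = pack_example([4, 8, 15], rid=2)
```

So rid is not a feature the CNN trains on; it is the supervision target, packed alongside the inputs so parsing a record recovers both.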
Could you provide the original dataset?
The position should be calculated for each word in the sentence, relative to the two target entity words.
An example was given in the paper:
People have been moving back into downtown
The corresponding embedding for "moving", relative to "People" and "downtown", should be:
[WordVec, 3, -3]
What the code actually uses is the position of the two entity words themselves. This largely defeats the purpose of position embeddings, which are meant to capture structural features of the sentence.
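The intended computation can be sketched as follows (position_features is a hypothetical helper, not the repo's code): each word gets its signed distance to each of the two entity head words.

```python
def position_features(tokens, e1_idx, e2_idx):
    """Signed distance of every token to the two entity words."""
    return [(i - e1_idx, i - e2_idx) for i in range(len(tokens))]

sent = 'People have been moving back into downtown'.split()
# "People" is entity 1 (index 0), "downtown" is entity 2 (index 6);
# "moving" (index 3) gets (3, -3), matching the paper's example.
feats = position_features(sent, 0, 6)
```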
Hello! I would like to know whether you have modified your model into a PCNN, and whether that source code is publicly available.
Hello, in the datasets ending in ".cln" under the data folder, what do the first five numbers represent? Are they the five word-level lexical features? Why are they numbers?
Is the data-preprocessing code available?
Thanks.
I changed the code to do validation on a dev set instead of the test set. But when I wanted to test the model on my test set, I got an error when mapping words to ids (reader/base.py), because the vocab.txt file was built only from the train and dev data.
Do I need to use all three of the train, dev, and test sets to build vocab.txt?
Wouldn't that be a heavy constraint when using the model to predict on new data whose vocabulary we don't know in advance?
Thanks in advance for your answer.
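A common workaround (my suggestion, not the repo's code) is to reserve an `<unk>` id when building vocab.txt from the training data, and to map any word unseen at test time to it, so the vocabulary never has to cover future data:

```python
UNK = '<unk>'

def build_vocab(sentences):
    """Assign ids from training sentences only; id 0 is reserved for <unk>."""
    vocab = {UNK: 0}
    for sent in sentences:
        for w in sent:
            vocab.setdefault(w, len(vocab))
    return vocab

def words_to_ids(words, vocab):
    """Map words to ids, falling back to the <unk> id for unseen words."""
    return [vocab.get(w, vocab[UNK]) for w in words]
```

With this, the model still only learns embeddings for training-time words, but inference on unseen vocabulary no longer crashes.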
Hi,
Thanks for the released code. I ran it and got an accuracy of 0.779 but no F1-score (in fact, the F1-score is usually about 4-5% lower than the accuracy). However, the F1-score in the paper is roughly 80-82% (excluding the WordNet lexical features).
So I wonder whether the paper uses tricks that are missing here. Have you been able to reproduce the paper's result?
Thanks.
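For reference, SemEval-2010 Task 8 reports macro-averaged F1 over the relation classes excluding Other, which typically comes out a few points below raw accuracy. A simplified pure-Python version (my sketch, not the official scorer.pl, which also handles directionality details):

```python
def macro_f1(gold, pred, excluded=('Other',)):
    """Macro-averaged F1 over all classes except those in `excluded`."""
    labels = sorted((set(gold) | set(pred)) - set(excluded))
    f1s = []
    for c in labels:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s) if f1s else 0.0
```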
Hi, thanks for releasing the source code.
I got a DataLossError when running the code. Your log files show the code running fine, so I do not know what went wrong on my machine.
Do you have any idea what happened?
Thank you very much.
Best,
Dat.
2018-01-12 16:43:37.268221: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: truncated record at 5986508
Traceback (most recent call last):
File "src/train.py", line 170, in <module>
tf.app.run()
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "src/train.py", line 164, in main
train(sess, m_train, m_valid)
File "src/train.py", line 103, in train
_, loss, acc = sess.run(fetches)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.DataLossError: truncated record at 5986508
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,6], [?], [?,?], [?,?], [?,?]], output_types=[DT_INT64, DT_INT64, DT_INT64, DT_INT64, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
Caused by op u'IteratorGetNext', defined at:
File "src/train.py", line 170, in <module>
tf.app.run()
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "src/train.py", line 139, in main
train_data, test_data, word_embed = base_reader.inputs()
File "/Users/dqnguyen/workspace/RelationExtraction/conv_relation-master/src/reader/base.py", line 317, in inputs
pad_value, shuffle=True)
File "/Users/dqnguyen/workspace/RelationExtraction/conv_relation-master/src/reader/base.py", line 280, in read_tfrecord_to_batch
batch = iterator.get_next()
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 259, in get_next
name=name))
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 706, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/Users/dqnguyen/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
DataLossError (see above for traceback): truncated record at 5986508
[[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,6], [?], [?,?], [?,?], [?,?]], output_types=[DT_INT64, DT_INT64, DT_INT64, DT_INT64, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
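In my experience, "truncated record" usually means a partially written TFRecord file, e.g. from an interrupted preprocessing run. Deleting the cached records and letting the reader regenerate them often resolves it. The data directory and *.tfrecord pattern below are assumptions, not the repo's exact layout.

```python
import glob
import os

def clear_tfrecords(data_dir='data'):
    """Remove cached TFRecord files so the reader rebuilds them from scratch."""
    removed = []
    for path in glob.glob(os.path.join(data_dir, '*.tfrecord')):
        os.remove(path)
        removed.append(path)
    return removed
```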
Hello Frank, @FrankWork
I think you have implemented the model well. However, it seems you made a mistake: you use the test dataset as the validation dataset (which is strictly forbidden in ML), so the performance shown in your log files is not reliable.
Maybe I have misunderstood your code. Looking forward to your reply.