xiangrongzeng / copy_re



Release for the ACL 2018 paper "Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism"

Python 100.00%

copy_re's Issues

f1/precision/recall

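Hello! When I train on WebNLG with the default settings, f1/precision/recall are only 0.218 0.182 0.199. What could be the cause? I tried switching the cell to GRU, and then, following the answer above, changed the learning rate to 0.01, but that only raised them to 0.352 0.358 0.346. If possible, I would appreciate advice on how to improve this.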
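
When scores look off, it can help to recompute them independently of the training script. The following is a minimal sketch of micro-averaged precision/recall/F1 over predicted and gold triples. The function name `triple_prf` and the tuple layout are assumptions made for illustration; this is not necessarily how the repository's own evaluation is implemented.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[int, int, int]  # assumed layout: (subject, object, relation)


def triple_prf(gold: List[Iterable[Triple]],
               pred: List[Iterable[Triple]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over per-sentence triple sets.

    A predicted triple counts as correct only if all three elements match
    a gold triple of the same sentence. Duplicates are collapsed via sets.
    """
    correct = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        g_set, p_set = set(g), set(p)
        correct += len(g_set & p_set)
        n_pred += len(p_set)
        n_gold += len(g_set)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [[(1, 2, 0), (3, 4, 1)]]
    pred = [[(1, 2, 0), (5, 6, 1)]]
    print(triple_prf(gold, pred))
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two; a value outside that range suggests the three numbers are being read in a different order than expected.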

dataset problem

Hi, Dr. Zeng. When I train on the NYT dataset, I can't find the files shown in the following image. Could you upload them?

data get

How can I get raw_train.json, raw_test.json, and raw_valid.json?

Raw NYT dataset

Hello!
Do you have the raw NYT dataset (including the training, validation, and test sets)? Could you send me a copy? My email is: [email protected].
Thank you!

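About the entity extraction part

Hello, I have a question. In train_Data, I see that each triple points to the position where the entity starts in the sentence. In that case, how is the last word of a two-word entity extracted?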

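Some questions

Hello! Sorry to bother you. I have been reading your paper over the past few days and was deeply impressed by its clever design and solid work. I still have a few questions after reading it and hope you can answer them.

The ACL 2017 paper "Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme" mentions that the NYT dataset already comes with a test set, so why did you split the data yourself? I also don't understand why sentences longer than 100 are filtered out.
Your paper says it extracts triples, but the approach can only extract the last word of each entity.
I hope to hear back from you. Thank you!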

What is the accuracy, precision report in paper?

If the entities in a triplet contain multiple words, your sequence decoding pipeline will never output the right triplet. Are all multi-word triplets then counted as wrong?
Or do you just evaluate against the inaccurate tagging target?

About the dependencies

In order to reproduce the result, can you give the "requirements.txt" or include it in README?
I don't know which version of tensorflow to install...
Looking forward to your answer!

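Raw WebNLG data

Hello!
Do you have the raw dataset from after you split the WebNLG corpus (including the training, validation, and test sets)? Could you send me a copy? My email is: [email protected].
Thank you!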

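WebNLG dataset question

Hello, I used the raw WebNLG dataset and the JSON processing script you provided, and merged the files under the 7 folders. After processing, it reports 5576 valid instances (5076 for training, 500 for validation), whereas the training set in the paper has 5019. Did I miss a preprocessing step? (entry number 7002, valid instance number 5576)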

Where is the embedding for relations?

I only found 'words2id', 'relations2id' and 'words_id2vector'. Where is the embedding for relations?

How to recover triples from ids for webnlg?

Hello, I am trying to map the triples_ids for the webnlg data back to the original triples in string form.
I understood after some time (it could be helpful to explain this clearly in the readme) that the data is formatted as (sentence_ids, triples_ids) and that each triples_ids is a list of dimension 3*n_triples_in_sentence composed as:

[subject_end_position, object_end_position, relation_id for i in range(n_triples_in_sentence)]

But without the entity_start_position, how can you recover the original triple?
Alternatively, is there any way to map the instances of your modified dataset back to the original instances in the webnlg data?
Thanks!
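
The layout described in this issue can be sketched as follows: split the flat triples_ids list into groups of three, then look up the word at each end position. The element order is taken from the description above and is not verified against the repository. `id2word` and `id2relation` are hypothetical inverse mappings of 'words2id' and 'relations2id'. As the question notes, only the last word of each entity is recoverable this way.

```python
from typing import Dict, List, Tuple


def split_triples(triples_ids: List[int]) -> List[Tuple[int, int, int]]:
    """Split a flat list into (subject_end, object_end, relation_id) groups.

    Assumption: the element order follows the issue text above.
    """
    if len(triples_ids) % 3 != 0:
        raise ValueError("triples_ids length must be a multiple of 3")
    return [
        (triples_ids[i], triples_ids[i + 1], triples_ids[i + 2])
        for i in range(0, len(triples_ids), 3)
    ]


def decode_triples(sentence_ids: List[int],
                   triples_ids: List[int],
                   id2word: Dict[int, str],
                   id2relation: Dict[int, str]) -> List[Tuple[str, str, str]]:
    """Map each triple to (subject last word, relation name, object last word)."""
    decoded = []
    for subj_end, obj_end, rel_id in split_triples(triples_ids):
        subj_word = id2word[sentence_ids[subj_end]]  # last word of the subject
        obj_word = id2word[sentence_ids[obj_end]]    # last word of the object
        decoded.append((subj_word, id2relation[rel_id], obj_word))
    return decoded


if __name__ == "__main__":
    # Toy vocabulary and sentence, invented purely for illustration.
    id2word = {0: "Barack", 1: "Obama", 2: "was", 3: "born", 4: "in", 5: "Hawaii"}
    id2relation = {0: "birthPlace"}
    sentence_ids = [0, 1, 2, 3, 4, 5]
    print(decode_triples(sentence_ids, [1, 5, 0], id2word, id2relation))
```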

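Does model evaluation also consider only the "last word" of an entity?

My understanding is that the model currently copies only one word per entity (the last word is used during training). But at evaluation time, are all entities in the gold answers also treated as just their last word?
From what I can see, the processed data records only the position of the last word and not the start position. That is fine for the training set, but wouldn't processing the test and dev sets the same way be a problem?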

f1/precision/recall

Hi, Dr. Zeng. When I use the default settings to train on the webnlg dataset, the f1/precision/recall seems to be 0.00.. and the entity in the visualized result is repeated twice. I still can't figure out what's wrong and am looking forward to your reply.

Dataset processing question

Hello!
When processing the data, do you first filter out all sentences that contain only the None relation?

Relation facts order question?

Have you considered that the order of the relation facts affects the loss computation?
For example:
The target triplets may be [(1,2,3),(4,5,6)], which should be equivalent to [(4,5,6),(1,2,3)]. With the loss computation described in the paper, the two orderings may give quite different values.
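
The concern can be illustrated with a toy comparison. The sketch below uses a position-wise mismatch count as a crude stand-in for a sequence-level loss (it is not the paper's loss function) and contrasts it with an order-invariant set comparison. The function names are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[int, int, int]


def flatten(triples: List[Triple]) -> List[int]:
    """Concatenate triples into one flat target sequence."""
    return [x for t in triples for x in t]


def positionwise_mismatches(target: List[Triple], predicted: List[Triple]) -> int:
    """Count positions where the flattened sequences differ.

    Stand-in for a step-by-step sequence loss: it depends on triple order.
    """
    a, b = flatten(target), flatten(predicted)
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def same_fact_set(target: List[Triple], predicted: List[Triple]) -> bool:
    """Order-invariant comparison: the two lists contain the same facts."""
    return set(target) == set(predicted)


if __name__ == "__main__":
    target = [(1, 2, 3), (4, 5, 6)]
    swapped = [(4, 5, 6), (1, 2, 3)]
    print(positionwise_mismatches(target, swapped))  # every position differs
    print(same_fact_set(target, swapped))            # same set of facts
```

One common way to reduce this sensitivity is to sort the target triples into a fixed canonical order before training, though whether that suits this model is an open question here.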
