cartus / aggcn Goto Github PK

Attention Guided Graph Convolutional Networks for Relation Extraction (authors' PyTorch implementation for the ACL19 paper)

License: MIT License

Python 98.97% Shell 1.03%

graph-neural-networks graph-convolutional-networks information-extraction relation-extraction nlp deep-learning

aggcn's People

Contributors

Stargazers

Watchers

Forkers

futong weigoss vhientran gopikachu88 prashant118 chenq1114 xiaojie2018 bobobe tj1116 zxlzr hi-zenanxu almoslmi dreaminvoker liuyijiang1994 tommorning wt123u tangshisong sxrczh memozhu yangli1221 yanghuili shiqing1234 liuzhichenger xw-jia rookieisme cenjat troublemeeter azdatascience baymaxoct chunningdu hjfeilg imnujf ys7yoo punchwes ammieqi gccyxy thoye shark803 hxl523 gstoica27 bsinghpratap foxlf823 sjliu0920 davidie phychaos archerzj xiaotingzizhang qin-folks hawksilent freedomkite fangruizeng claniakea aliqiao huhuhu2345 yanzhangnlp kanglicheng sorrowyn greitzmann jingyangzhang0222 divinetx fer-git li-ming-fan zohaibramzan johnson7788 elveshh hjqhjq minusvip guojinyu88 chrisjanwust ohnozzy pengwork jasmine-yu yuansurex sohailkhanmarwat knowledge-graph-relation-extraction fusion-research mehrnooshbazrafkan techthiyanes awyys altraman hyl2048 harishgovardhandamodar lv184614886 saleh-sha wanwano wonlee2019

aggcn's Issues

关于模型中的M个block的问题

您好，我理解的是AGGCN中的每个block都含有三个层，其中对应注意力头的个数含有N个密集连接层。其中N个密集连接层的输出拼接后通过linear combination layer得到当前block的输出的句子表示。最终句子的表示请问是M个block的输出拼接吗，即与的关系是什么

Questions about the updated results

Hi, In the issue 7, you said that
"For the mean and std of F1 score in my experiments, the stats is 68.2% +- 0.5%. Also, we will update this score on our paper, for a fair an concrete comparison to other methods."

However, In the final version of ACL19 and the latest version in arxiv, the F1 score on TACRED are both 69.0. May I ask where the latest results were released?

Dependency type

Hi @Cartus
Thank you very much for your sharing. I would like to ask whether the dependency type information is used or whether only the edge between the nodes is considered and the specific edge type is not considered?

h_out's dimension problem ??

As you mentioned in your paper, h_out belongs to d * N. Shouldn't he belong to (d * n) * n?

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at ..\src\THC\THCGeneral.cpp:70

I use your enviroment, bug I meet some issues，I use your code in windows10, bug it tell me "no CUDA-capable device is detected at ..\src\THC\THCGeneral.cpp:70"

What is the function of tensor "denom" ?

In the Class of GraphConvLayer and MultiGraphConvLayer, there is a tensor called denom. I have tried to understand its meaning,however I failed.So please tell me this tensor's meaning.thanks!

` def forward(self, adj, gcn_inputs):
# gcn layer
denom = adj.sum(2).unsqueeze(2) + 1
outputs = gcn_inputs
cache_list = [outputs]
output_list = []

    for l in range(self.layers):  # 0, 1
        Ax = adj.bmm(outputs)
        AxW = self.weight_list[l](Ax)
        AxW = AxW + self.weight_list[l](outputs)  # self loop
        AxW = AxW / denom 
        gAxW = F.relu(AxW)
        cache_list.append(gAxW)
        outputs = torch.cat(cache_list, dim=2) 
        output_list.append(self.gcn_drop(gAxW))`

Evaluation score lower than reported

Hi,
I retrain the model with 5 different random seeds on TACRED. However, the average F1 score is 67.116(+-0.121), which is much lower than the reported score in your paper. Is the default model config correct? Also, how large is the std in your experiments?

how to preprocess the dataset

I would like to use AGGCN on my own dataset. Could you please kindly provide the preprocess code? Thank you!!

What method did you use to generate the dependency trees for the semeval dataset

Hi there. Thanks for posting your code!

I'm interested in applying this methodology to other datasets. My reading of it is that you preprocess data to derive the dependency tree before training. What method did you use to do this for the Semeval dataset, as this doesn't have the dependency tree annotated in the original data?

Question about the adjacent matrix

Hi~

In your abstract, you said "In this work, we propose Attention Guided Graph Convolutional Networks (AGGCNs), a novel model which directly takes full dependency trees as inputs", but I don's see how do you make use of the dependency tree.

According to Equation (2), seems you just construct the $\tilde{A}$ by using the input embeddings, so where do you make use of the original adjacency matrix $A$?

In your code, I still cannot confirm how do you make use of $A$, which is adj in your code, right?
https://github.com/Cartus/AGGCN_TACRED/blob/master/model/aggcn.py#L175

Could you please show me the details?
And by the way, that is the matrix $V$ in your Equation (2)? I can not find the definition of $V$ neither.

F1=0

https://github.com/yuhaozhang/tacred-relation.git
I am using tacred dataset from this github and F1=0. Can you help me to explain why. Tks alot

I found an error: train.py: error: argument --id: expected one argument

While running bash script 'aggcn.sh' for semeval dataset on terminal, i found the above error. I tried to fix it but could not. Kindly help.

About replacing data sets

hello,
Sorry for disturbing you,When I replaced the data set with a Chinese data set, the loss became extremely huge. What is the reason?
Finetune all embeddings.
epoch 1: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_2.pt

epoch 2: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_3.pt

epoch 3: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_4.pt

epoch 4: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_5.pt

epoch 5: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_6.pt

epoch 6: train_loss = 768000105463386346618880.000000
model saved to ./saved_models/01/checkpoint_epoch_7.pt

2020-12-23 08:47:06.380100: step 20/450 (epoch 7/150), loss = 576000088104739014705152.000000 (0.161 sec/batch), lr: 0.010000

About case study and attention visualization mentioned in paper

hello, @Cartus ,Thanks for your wonderful work! Do you have the case study and attention visualization mentioned in your paper, I think that it's useful for me to analyse the model.

Case study and attention visualization of this example are provided in the supplementary material

Appreciate that If you could make it available, Thanks in advance!

the Final Score

hello, I ran the code you gave and got the Final Score:
Precision (micro): 74.361%
Recall (micro): 30.617%
F1 (micro): 43.375%
Why is there a gap with the original result?

RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp:50

(GPU_pytorch) hz071@hamilton:~/AGGCNN/AGGCN$ bash train_aggcn.sh 1
Vocab size 53953 loaded from file
Loading data from dataset/tacred with batch size 50...
1363 batches created for dataset/tacred/train.json
453 batches created for dataset/tacred/dev.json
Config saved to file ./saved_models/01/config.json
Overwriting old vocab file at ./saved_models/01/vocab.pkl

Running with the following configs:
data_dir : dataset/tacred
vocab_dir : dataset/vocab
emb_dim : 300
ner_dim : 30
pos_dim : 30
hidden_dim : 300
num_layers : 2
input_dropout : 0.5
gcn_dropout : 0.5
word_dropout : 0.04
topn : 10000000000.0
lower : False
heads : 3
sublayer_first : 2
sublayer_second : 4
pooling : max
pooling_l2 : 0.002
mlp_layers : 1
no_adj : False
rnn : True
rnn_hidden : 300
rnn_layers : 1
rnn_dropout : 0.5
lr : 0.7
lr_decay : 0.9
decay_epoch : 5
optim : sgd
num_epoch : 100
batch_size : 50
max_grad_norm : 5.0
log_step : 20
log : logs.txt
save_epoch : 100
save_dir : ./saved_models
id : 1
info :
seed : 0
cuda : False
cpu : False
load : False
model_file : None
num_class : 42
vocab_size : 53953
model_save_dir : ./saved_models/01

Finetune all embeddings.
/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/rnn.py:50: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
"num_layers={}".format(dropout, num_layers))
THCudaCheck FAIL file=/tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp line=50 error=100 : no CUDA-capable device is detected
Traceback (most recent call last):
File "train.py", line 119, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/trainer.py", line 67, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 22, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 47, in init
self.gcn = AGGCN(opt, embeddings)
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 129, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/home/hz071/AGGCNN/AGGCN/model/aggcn.py", line 205, in init
self.weight_list = self.weight_list.cuda()
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 201, in _apply
module._apply(fn)
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 223, in _apply
param_applied = fn(param)
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 304, in
return self._apply(lambda t: t.cuda(device))
File "/home/hz071/.conda/envs/GPU_pytorch/lib/python3.7/site-packages/torch/cuda/init.py", line 197, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /tmp/pip-req-build-ufslq_a9/aten/src/THC/THCGeneral.cpp:50
(GPU_pytorch) hz071@hamilton:~/AGGCNN/AGGCN$ python
Python 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

import torch
torch.cuda.is_available()
True

I am using pytorch=1.4.0, python=3.7 and you can see above Cuda is available too. What should i do?

Question on SemEval2010_task8 dataset

May I ask your parameters setting on SemEval2010_task8 dataset?
I can't get the performance on SemEval2010_task8 in the paper.
Thank you so much!

Running AGGCN but came across an error (cuda runtime error (38))

Hello. Thank you for sharing your code.

My environment is Python 3.6.8, PyTorch 0.4.1, and CUDA 9.0.

I got errors as below:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp line=74 error=38 : no CUDA-capable device is detected
Traceback (most recent call last):
File "train.py", line 119, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/trainer.py", line 67, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/aggcn.py", line 22, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/code/AGGCN/model/aggcn.py", line 47, in init
self.gcn = AGGCN(opt, embeddings)
File "/code/AGGCN/model/aggcn.py", line 129, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/code/AGGCN/model/aggcn.py", line 205, in init
self.weight_list = self.weight_list.cuda()
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in cuda
return self._apply(lambda t: t.cuda(device))
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
module._apply(fn)
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 191, in _apply
param.data = fn(param.data)
File "/etc/anaconda3/envs/aggcn/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in
return self._apply(lambda t: t.cuda(device))
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp:74

My GPU is empty, and all other example codes on GPU is fine.
Have you come across the error? Thank you.

Why the number of classes on the SemEval2010-Task 8 is only 10?

Hi @Cartus,
Sorry for disturbing you, but as I know, the number of classes in the SemEval2010-Task 8 is 19. (9 directed relations and a special Other class). Why the number of classes in your json semeval folder is only 10 (also in the dict LABEL_TO_ID in the file utils/constant.py)? I wonder that we are ignoring the directionality of relations?
Thank you for your consideration!

code discussion

what does the parameters sublayer_first and sublayer_second mean in your code?

请问代码中的mlp output layer 是用来干嘛的

micro F1 and macro F1

The F1 score in your paper is macro F1, but in your code you calculate the micro score.
Can you share us with the code to calculate macro F1 score?
Thank you so much!

适用于中文数据集吗？

您好，
注意到论文中的实验都是在英文语料上完成的，我现在想应用到中文语料上，对于中文如何构造邻接矩阵，您有什么建议吗？
我的数据集示例如下：
{ "doc_id": "1", "paragraphs": [ { "paragraph_id": "0", "paragraph": "**成人2型糖尿病胰岛素促泌剂应用的专家共识", "sentences": [ { "sentence_id": "0", "sentence": "**成人2型糖尿病胰岛素促泌剂应用的专家共识", "start_idx": 0, "end_idx": 22, "entities": [ { "entity_id": "T0", "entity": "2型糖尿病", "entity_type": "Disease", "start_idx": 4, "end_idx": 9 }, { "entity_id": "T1", "entity": "2型", "entity_type": "Class", "start_idx": 4, "end_idx": 6 }, { "entity_id": "T2", "entity": "胰岛素促泌剂", "entity_type": "Drug", "start_idx": 9, "end_idx": 15 } ], "relations": [ { "relation_type": "Drug_Disease", "relation_id": "R0", "head_entity_id": "T2", "tail_entity_id": "T0" }, { "relation_type": "Class_Disease", "relation_id": "R1", "head_entity_id": "T1", "tail_entity_id": "T0" } ] } ] } }

environment error

The version of PyTorch in Readme is 0.4.1, but the configuration environment cannot find that version. Can you provide the environment requirements for this project? It would be great if the requirements file was available!

train set performance

hi, I would like to ask why the model on train set can only reach about 0.84 f1-value, does it means the model is under-fitting ? thanks!

看了论文和代码想问一下，这实际上是一个关系分类的模型吗

semeval-10 和 TACRED这两个数据集是不是在验证集和测试集上都给定了实体对，然后预测该实体对的关系类别吗

所以这个模型不能直接给一句话，然后直接预测包含的实体对和关系吧...

some question about your paper

Hello, I read your thesis recently, and I found that the formula (2) in your paper seems to be different from the algorithm in the code. In your code, calculating the attention guided adjacency matrix, the V matrix is not used, but V appears in your formula (2). If you calculate according to the formula (2) in your paper, some problems may arise in the dimensions of the data.

关于准确率的问题

郭哥您好，在尝试复现论文代码时，发生了一些问题，执行语句为：python3 train.py --id $SAVE_ID --seed 0 --hidden_dim 300 --lr 0.7 --rnn_hidden 300 --num_epoch 100 --pooling max --mlp_layers 1 --num_layers 2 --pooling_l2 0.002

此时准确率均为100%且f1值均为0，详细截图如下：

希望您在方便时候能不吝赐教，谢谢！

Experiment on n-ary relation extraction

请问如何执行n-ary relation extraction实验的代码呢，可以给出command么？就是你们paper的table 1的结果。

非常感谢~~

some error about:"RuntimeError: cuda runtime error (100) : "

hello,when i run the " bash train_aggcn.sh",there are some errors.
but,i test my torch:
Default GPU Device: /device:GPU:0
Your tensorflow-gpu is available
Default GPU Device: Tesla P4
Your pytorch-gpu is available
and,my environment is
cuda:10.1
python:3.7.5
pytorch:1.3.
could you please help me to solve this error?
thank you very much!

Finetune all embeddings.
Traceback (most recent call last):
File "train.py", line 120, in
trainer = GCNTrainer(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/trainer.py", line 68, in init
self.model = GCNClassifier(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 26, in init
self.gcn_model = GCNRelationModel(opt, emb_matrix=emb_matrix)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 51, in init
self.gcn = AGGCN(opt, embeddings)
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 133, in init
self.layers.append(GraphConvLayer(opt, self.mem_dim, self.sublayer_first))
File "/home/yaoshengnan/AGGCN-master/semeval/model/aggcn.py", line 206, in init
self.weight_list = self.weight_list.cuda()
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
return self._apply(lambda t: t.cuda(device))
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 202, in _apply
module._apply(fn)
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 224, in _apply
param_applied = fn(param)
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in
return self._apply(lambda t: t.cuda(device))
File "/root/anaconda3/envs/tf14/lib/python3.7/site-packages/torch/cuda/init.py", line 193, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (100) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:50

Other dataset

Hi,
This code is about TACRED dataset, can you provide other code about other datasets mentioned in paper?
Thank!

About "M identical blocks"

Hello, I would like to ask you how to understand “M identical blocks” in the paper? and,What's the specific meaning of this? thank you!

Dependency information of SemEval 2010 Task8

Thank you for your sharing. We are very fortunate to have such a responsible author. Before you release the code for SemEval 2010 Task8, I want to know how you obtain the dependency information for it. By StanfordNLP?

Why did you configure the first densely connected layer with GraphConvLayer?

hello :) I ran the code you gave but I would like to ask you why you constructed the first densely connected layer (layers[0]) with GraphConvLayer.
The other layers(layers[1], ..., layers[N-1]) were configured with MultiGraphConvLayer, right?
please let me know any mistake. Thank you !

how to get standford_head and stanford_deprel for cross-sentence data

hello, when I tried to generate the two fields（stanford_head and stanford_deprel） with the stanfordnlp tool,and converted the result into a recognizable pattern for your code， I found a little difference between my results and yours(pubmed dataset).

for example, the sentence"2. acquired resistance mechanism 1 ) secondary t790m mutation of the egfr gene unfortunately , many of those patient who originally had respond eventually become insensitive to gefitinib or erlotinib therapy through acquire resistance ."

my output:

your output：

we can see that,in your output ,there're many "-1" value and "self" content.

So,I wonder how do you get the output?Is it because our tool versions are different?

How to test the n-ary relation extraction part of the experiment?

When I ran this command: python3 eval.py saved_models/01 --dataset PubMed
I encountered this problem: FileNotFoundError: [Errno 2] No such file or directory: 'dataset/tacred/PubMed.json'

Setting on SemEval2010_task8 dataset

I'm really grateful that you share the setting of TACRED dataset. Would you please to share the setting of SemEval2010_task8 dataset

关于AGGCN模型细节的问题

师兄您好！
看了AGGCN那部分封装的模型代码，有个地方不了解。
在gcn层中定义的MoedlList，其中包含4个子层。分别是两个GraphConvLayer和两个MultiGarphConvLayer。因此我没明白为啥只有后两层才使用注意力机制？我理解的是不是对每条文本都要使用注意力机制生成全连接加权图吗？

请问我怎么在semeval数据集上运行您的代码？

您好，
我发现项目里有关于semeval的数据集和相关代码，但是在README里面并没有提及关于semeval的数据集和相关代码的任何事情。
因此，我想问一下，如果要在semeval数据集上运行您的代码，需要做哪些事情？模型在semeval数据集上的得分情况怎么样？

Code to convert SemEval2010_task8 dataset

Sorry to bother you again, I check the data format in the code:
https://github.com/qipeng/gcn-over-pruned-trees
I have to convert SemEval2010_task8 dataset to TARCED's format, but I don't know what to fill the key subj_type and obj_type.
May I ask what to fill this two keys?
Or May I have your code for transfering SemEval2010_task8 dataset to TARCED's format?
Thank you so much!