rearev_kgqa's Introduction

ReaRev [EMNLP 2022]

This is the code for the EMNLP 2022 Findings paper: ReaRev: Adaptive Reasoning for Question Answering over Knowledge Graphs.

Overview

Our method improves instruction decoding and execution for KGQA via adaptive reasoning (see the overview figure in the paper).

Get Started

The requirements are simple and listed in `requirements.txt`. Install them and you should be able to run the code immediately.
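
For example, from the repository root (a virtual environment is optional and not prescribed by the repository):

pip install -r requirements.txt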

We use the pre-processed data from:
https://drive.google.com/drive/folders/1qRXeuoL-ArQY7pJFnMpNnBu0G-cOz6xv
Download it and extract it into a folder named `data`.
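
After extraction, the directory layout should roughly match the paths used in the commands below (the exact file contents depend on the downloaded archive):

data/
  webqsp/
  CWQ/
  metaqa-3hop/
  incomplete/   (only needed for the incomplete-Webqsp setting)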

Acknowledgements:

NSM: Datasets (webqsp, CWQ, MetaQA) / Code.

GraftNet: Datasets (webqsp incomplete, MetaQA) / Code.

Training

To run Webqsp:

python main.py ReaRev --entity_dim 50 --num_epoch 200 --batch_size 8 --eval_every 2 \
--data_folder data/webqsp/ --lm sbert --num_iter 3 --num_ins 2 --num_gnn 2 \
--relation_word_emb True --experiment_name Webqsp322 --name webqsp

To run CWQ:

python main.py ReaRev --entity_dim 50 --num_epoch 100 --batch_size 8 --eval_every 2 \
--data_folder data/CWQ/ --lm sbert --num_iter 2 --num_ins 3 --num_gnn 3 \
--relation_word_emb True --experiment_name CWQ --name cwq

To run MetaQA-3:

python main.py ReaRev --entity_dim 50 --num_epoch 10 --batch_size 8 --eval_every 2  \
--data_folder data/metaqa-3hop/  --lm lstm --num_iter 2 --num_ins 3 --num_gnn 3  \
--relation_word_emb False --experiment_name metaqa3 --name metaqa 

For the incomplete Webqsp setting, see `data/incomplete/` (after obtaining the data via GraftNet). If you cannot afford a lot of memory for CWQ, use the `--data_eff` argument (see the available arguments in `parsing.py`); a sketch of such a run is given below.
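
For example, a memory-efficient CWQ run might look like the following; whether `--data_eff` takes an explicit boolean value (as the other flags above do) should be confirmed in `parsing.py`, and the experiment name is arbitrary:

python main.py ReaRev --entity_dim 50 --num_epoch 100 --batch_size 8 --eval_every 2 \
--data_folder data/CWQ/ --lm sbert --num_iter 2 --num_ins 3 --num_gnn 3 \
--relation_word_emb True --data_eff True --experiment_name CWQ_eff --name cwq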

Results

We also provide some pretrained ReaRev models (ReaRev_webqsp.ckpt, ReaRev_webqsp_v2.ckpt, ReaRev_CWQ.ckpt). You can download them from here. Please extract them into a folder `checkpoint/pretrain/`.
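
For example, assuming the checkpoints were downloaded into the current directory:

mkdir -p checkpoint/pretrain/
mv ReaRev_webqsp.ckpt ReaRev_webqsp_v2.ckpt ReaRev_CWQ.ckpt checkpoint/pretrain/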

To reproduce Webqsp results, run:

python main.py ReaRev --entity_dim 50 --num_epoch 200 --batch_size 8 --eval_every 2 --data_folder data/webqsp/ --lm sbert --num_iter 3 --num_ins 2 --num_gnn 3 --relation_word_emb True --load_experiment ReaRev_webqsp.ckpt --is_eval --name webqsp

or

python main.py ReaRev --entity_dim 50 --num_epoch 200 --batch_size 8 --eval_every 2 --data_folder data/webqsp/ --lm sbert --num_iter 3 --num_ins 2 --num_gnn 2 --relation_word_emb True --load_experiment ReaRev_webqsp_v2.ckpt --is_eval --name webqsp

To reproduce CWQ results, run:

python main.py ReaRev --entity_dim 50 --num_epoch 100 --batch_size 8 --eval_every 2 --data_folder data/CWQ/ --lm sbert --num_iter 2 --num_ins 3 --num_gnn 3 --relation_word_emb True --load_experiment ReaRev_CWQ.ckpt --is_eval --name cwq

Models        Webqsp   CWQ    MetaQA-3hop
KV-Mem        46.7     21.1   48.9
GraftNet      66.4     32.8   77.7
PullNet       68.1     45.9   91.4
NSM-distill   74.3     48.8   98.9
ReaRev        76.4     52.9   98.9

Cite

If you find our code or method useful, please cite our work as

@article{mavromatis2022rearev,
  title={ReaRev: Adaptive Reasoning for Question Answering over Knowledge Graphs},
  author={Mavromatis, Costas and Karypis, George},
  journal={arXiv preprint arXiv:2210.13650},
  year={2022}
}

or

@inproceedings{mavromatis-karypis-2022-rearev,
    title = "{R}ea{R}ev: Adaptive Reasoning for Question Answering over Knowledge Graphs",
    author = "Mavromatis, Costas  and
      Karypis, George",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-emnlp.181",
    pages = "2447--2458",
}

rearev_kgqa's Issues

What are the real meanings of the entities?

Dear authors, thank you for your work and for your kind replies!
In the WebQSP dataset you provide, the file entities.txt stores entities such as "m.04hxgbs". I would like to know the real meanings of these entities so that I can better understand the reasoning process. Is there a file that maps each entity to its real meaning?
Many thanks for your reply!

self attention

Hello, I saw in the paper that different instruction nodes can be fused using self-attention. Can you explain the process in detail?
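
A minimal PyTorch sketch of one way to fuse several instruction vectors with scaled dot-product self-attention is shown below. This is only an illustration of the general mechanism (tensor names and shapes are assumptions), not necessarily the exact operation used in ReaRev:

import torch
import torch.nn.functional as F

def fuse_instructions(instructions):
    # instructions: (batch, K, d) -- K instruction vectors per question
    d = instructions.size(-1)
    # Each instruction attends to all K instructions (scaled dot-product attention).
    scores = torch.matmul(instructions, instructions.transpose(1, 2)) / (d ** 0.5)  # (batch, K, K)
    attn = F.softmax(scores, dim=-1)
    # Each output vector is an attention-weighted mixture of the input instructions.
    return torch.matmul(attn, instructions)  # (batch, K, d)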

Experiments on the MetaQA-3 dataset

Hello, thank you for providing the code! I am having trouble running the experiment on MetaQA-3. The code is showing the following error:
Traceback (most recent call last):
  File "main.py", line 52, in <module>
    main()
  File "main.py", line 40, in main
    trainer.train(0, args.num_epoch - 1)
  File "E:\ReaRev_KGQA-main\train_model.py", line 125, in train
    loss, extras, h1_list_all, f1_list_all = self.train_epoch(totle_step)
  File "E:\ReaRev_KGQA-main\train_model.py", line 219, in train_epoch
    loss, _, _, tp_list = self.model(batch, training=True)
  File "C:\Anaconda3\envs\wikidataqa\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\ReaRev_KGQA-main\models\ReaRev\rearev.py", line 188, in forward
    self.init_reason(curr_dist=current_dist, local_entity=local_entity,
  File "E:\ReaRev_KGQA-main\models\ReaRev\rearev.py", line 137, in init_reason
    rel_features, rel_features_inv = self.get_rel_feature()
  File "E:\ReaRev_KGQA-main\models\ReaRev\rearev.py", line 100, in get_rel_feature
    rel_features = self.instruction.question_emb(self.rel_features)
  File "C:\Anaconda3\envs\wikidataqa\lib\site-packages\torch\nn\modules\module.py", line 1130, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LSTMInstruction' object has no attribute 'question_emb'

Then I noticed that question_emb seemed to be undefined in LSTMInstruction. I am looking forward to your answer. Thank you.

KG incompleteness experiments

Hi, I hope you are doing well.

I was curious whether --fact_dropout is the argument you used to obtain the Low-Data Regime results in your paper.

Do I simply pass --fact_dropout 0.9 to experiment in the 10% data regime?

This experimental setting randomly drops a certain portion of the facts at every iteration. Am I correct?

Thank you very much in advance :)
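
Assuming `--fact_dropout` indeed takes a drop ratio in [0, 1] (see `parsing.py` to confirm; this is an assumption, not a documented protocol), a 10%-facts Webqsp run might look like:

python main.py ReaRev --entity_dim 50 --num_epoch 200 --batch_size 8 --eval_every 2 \
--data_folder data/webqsp/ --lm sbert --num_iter 3 --num_ins 2 --num_gnn 2 \
--relation_word_emb True --fact_dropout 0.9 --experiment_name webqsp_lowdata --name webqsp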

The running result

Hello, thank you for your code. I trained the model on the Webqsp and CWQ datasets in the way you described, with all parameters identical to the ones you provided. The final results on Webqsp are F1: 0.7028, H1: 0.7498, and the result on CWQ is H1: 0.5251, which does not reach the numbers reported in the paper. I still cannot reach them after modifying the batch size and the number of epochs. What could be the reason for this? Thanks!

KG incompleteness experiments for Webqsp

Your work is very interesting.
I want to run experiments for Webqsp with 50% KG completeness. However, when I change the parameter 'fact_drop' in parsing.py to 0.5, I get 71.14 for H1 and 58.62 for F1. Can you tell me what the correct setup is? Thanks a lot!

KG incompleteness experiments for MetaQA-3

In your paper, you report good results on the incomplete versions of Webqsp and MetaQA-3. Can you provide the incomplete data you used in your MetaQA-3 experiments? Thanks!

dataset

Hello, your code has been very helpful to me, but I have a question: how can I replace this dataset with my own dataset?

MetaQA experiment protocol

Hi, it's me again.
I am curious about the MetaQA-3 experiment protocol you used in the paper.
Is the MetaQA dataset uploaded to your drive the full dataset or the low-regime experiment dataset?
If it is the full dataset, how can I reduce the portion of training QAs, as done in your paper?
Many thanks! :)
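
If the uploaded data turns out to be the full MetaQA-3 training split, one generic way to create a low-regime subset is to subsample the training file. The file names and the line-per-example format below are hypothetical and need to be adapted to the actual preprocessed files in data/metaqa-3hop/:

import random

# Hypothetical file names: adapt to the actual training split in data/metaqa-3hop/.
src = "data/metaqa-3hop/train.json"
dst = "data/metaqa-3hop/train_10pct.json"
ratio = 0.1  # keep 10% of the training QAs

random.seed(0)  # fixed seed so the subset is reproducible
with open(src) as f:
    lines = f.readlines()
with open(dst, "w") as f:
    f.writelines(random.sample(lines, int(len(lines) * ratio)))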

Some questions about the results

Hello, thank you very much for your contribution. I have some questions. Browsing the web322log file you provided, the final result is:
2022-10-14 21:09:33,881 - root - INFO - Train Done! Evaluate on testset with saved model
2022-10-14 21:09:48,012 - root - INFO - Best h1 evaluation
2022-10-14 21:09:48,012 - root - INFO - TEST F1: 0.6879, H1: 0.7529
2022-10-14 21:10:02,328 - root - INFO - Best f1 evaluation
2022-10-14 21:10:02,328 - root - INFO - TEST F1: 0.7122, H1: 0.7578
2022-10-14 21:10:16,537 - root - INFO - Final evaluation
2022-10-14 21:10:16,537 - root - INFO - TEST F1: 0.7166, H1: 0.7596
The results reported in your paper are F1: 70.9, H1: 76.4.
There are some differences. Which result do you end up using: the best of the three evaluations, the final round, or the average of the three?
I hope to get your answer, thank you!
