WSDM2021_NSM (Neural State Machine for KBQA)

This is our PyTorch implementation for the paper:

Gaole He, Yunshi Lan, Jing Jiang, Wayne Xin Zhao and Ji-Rong Wen (2021). Improving Multi-hop Knowledge Base Question Answering by Learning Intermediate Supervision Signals. paper, slides, poster, video, CN blog. In WSDM'2021.

Introduction

Multi-hop Knowledge Base Question Answering (KBQA) aims to find the answer entities that are multiple hops away in the Knowledge Base (KB) from the entities mentioned in the question. A major challenge is the lack of supervision signals at intermediate steps: multi-hop KBQA algorithms only receive feedback from the final answer, which makes learning unstable or ineffective. To address this challenge, we propose a novel teacher-student approach for the multi-hop KBQA task.

Requirements:

  • Python 3.6
  • PyTorch >= 1.3

Dataset

We provide three processed datasets: WebQuestionsSP (webqsp), Complex WebQuestions 1.1 (CWQ), and MetaQA.

  • We follow GraftNet to preprocess the datasets and construct question-specific graphs.
  • You can find instructions for obtaining the datasets used in this repo in the preprocessing folder.
  • You can also download the preprocessed datasets from Google Drive, unzip them into the dataset folder, and use the config --data_folder <data_path> to point to them.
| Datasets    | Train   | Dev    | Test   | #entity | coverage |
| ----------- | ------- | ------ | ------ | ------- | -------- |
| MetaQA-1hop | 96,106  | 9,992  | 9,947  | 487.6   | 100%     |
| MetaQA-2hop | 118,980 | 14,872 | 14,872 | 469.8   | 100%     |
| MetaQA-3hop | 114,196 | 14,274 | 14,274 | 497.9   | 99.0%    |
| webqsp      | 2,848   | 250    | 1,639  | 1,429.8 | 94.9%    |
| CWQ         | 27,639  | 3,519  | 3,531  | 1,305.8 | 79.3%    |

Each dataset is organized with the following structure:

  • data-name/
    • *.dep: file containing the question id, question text and dependency parse (not used in our code);
    • *_simple.json: dataset file; every line describes a question and its question-specific graph. You can see how this file is generated in simplify_dataset.py, which mainly maps entities and relations to their global ids in entities.txt and relations.txt (see the sketch after this list);
    • entities.txt: file containing the list of entities;
    • relations.txt: file containing the list of relations;
    • vocab_new.txt: vocab file;
    • word_emb_300d.npy: GloVe embeddings for the vocab.
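
For reference, here is a minimal sketch (not the repo's code) of the global-id mapping that simplify_dataset.py performs; the entity2id/relation2id names match the tracebacks in the issues below, while load_index and simplify_tuple are illustrative helpers:

```python
def load_index(path):
    # entities.txt / relations.txt contain one item per line;
    # the line number is the item's global id.
    with open(path) as f:
        return {line.strip(): idx for idx, line in enumerate(f)}

entity2id = load_index("entities.txt")
relation2id = load_index("relations.txt")

def simplify_tuple(head_text, rel_text, tail_text):
    # Map one textual (head, relation, tail) triple to global ids,
    # as done for every tuple in a question's subgraph.
    return (entity2id[head_text], relation2id[rel_text], entity2id[tail_text])
```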

Results

We provide results for WebQuestionsSP (webqsp), Complex WebQuestions 1.1 (CWQ), and MetaQA.

  • We follow GraftNet to conduct evaluation. Baseline results are taken from the original papers or related work.
| Models    | webqsp | MetaQA-1hop | MetaQA-2hop | MetaQA-3hop | CWQ  |
| --------- | ------ | ----------- | ----------- | ----------- | ---- |
| KV-Mem    | 46.7   | 96.2        | 82.7        | 48.9        | 21.1 |
| GraftNet  | 66.4   | 97.0        | 94.8        | 77.7        | 32.8 |
| PullNet   | 68.1   | 97.0        | 99.9        | 91.4        | 45.9 |
| SRN       | -      | 97.0        | 95.1        | 75.2        | -    |
| EmbedKGQA | 66.6   | 97.5        | 98.8        | 94.8        | -    |
| NSM       | 68.7   | 97.1        | 99.9        | 98.9        | 47.6 |
| NSM+p     | 73.9   | 97.3        | 99.9        | 98.9        | 48.3 |
| NSM+h     | 74.3   | 97.2        | 99.9        | 98.9        | 48.8 |

The leaderboard result for NSM+h is 53.9, and we ranked 2nd as of 22 May 2021. (We would have been ranked top-1 had we submitted around the WSDM 2021 deadline, 17 August 2020.) leaderboard

Training Instruction

Download the preprocessed datasets from Google Drive, unzip them into the dataset folder, and use the config --data_folder <data_path> to point to them. Trained models for the webqsp and CWQ datasets are available on Google Drive. Use the following args to run the code, and make sure the directory given by --checkpoint_dir exists; the bash scripts assume a 'checkpoint' folder in this repository:

mkdir checkpoint
Example commands: run_webqsp.sh, run_CWQ.sh, run_metaqa.sh

You can directly load a trained ckpt and run fast evaluation by appending --is_eval --load_experiment <ckpt_file> to the example commands. Note that --load_experiment only accepts a path relative to --checkpoint_dir (a sketch of the path resolution follows).
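
A quick illustration, assuming the plain path join shown in the logs in the issues below (e.g. checkpoint/CWQ_teacher/../pretrain/CWQ_nsm-final.ckpt):

```python
import os

checkpoint_dir = "checkpoint/CWQ_teacher/"          # --checkpoint_dir
load_experiment = "../pretrain/CWQ_nsm-final.ckpt"  # --load_experiment (relative!)

# The ckpt path is resolved against the checkpoint dir, not the repo root.
ckpt_path = os.path.join(checkpoint_dir, load_experiment)
print(ckpt_path)  # checkpoint/CWQ_teacher/../pretrain/CWQ_nsm-final.ckpt
```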

You can also get detailed evaluation information about every question in the test set, saved as a file in --checkpoint_dir. For more details, refer to NSM/train/evaluate_nsm.py.

Important arguments:

--data_folder          Path to the dataset.
--checkpoint_dir       Path for saving checkpoints and logs.
--num_step             Number of multi-hop reasoning steps (hyperparameter).
--entity_dim           Hidden size of the reasoning module.
--eval_every           Number of epochs between evaluations.
--experiment_name      Name of the log and ckpt. If not set, it is generated from the timestamp.
--eps                  Accumulated probability threshold for collecting answers; affects the Precision, Recall and F1 metrics (see the sketch below).
--use_self_loop        If set, add a self-loop edge to every graph node.
--use_inverse_relation If set, add reverse edges to the graph.
--encode_type          If set, use a type layer to initialize entity embeddings.
--load_experiment      Path to a trained ckpt; only a path relative to --checkpoint_dir is accepted.
--is_eval              If set, run fast evaluation on the test set with the trained ckpt from --load_experiment.
--reason_kb            If set, the model reasons step by step. Otherwise, it may attend to all graph nodes at every step.
--load_teacher         Path to a teacher ckpt; only a path relative to --checkpoint_dir is accepted.
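
To make --eps concrete, here is a hedged sketch of threshold-based answer collection (the actual logic lives in NSM/train/evaluate_nsm.py and may differ in detail):

```python
import numpy as np

def collect_answers(probs, eps=0.95):
    """Take entities in decreasing probability until the accumulated
    mass reaches eps; `probs` is a 1-D distribution over candidates."""
    order = np.argsort(-probs)
    picked, acc = [], 0.0
    for idx in order:
        picked.append(int(idx))
        acc += float(probs[idx])
        if acc >= eps:
            break
    return picked

# A larger eps collects more answers, trading precision for recall.
```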

Acknowledgement

Any scientific publication that uses our code or datasets should cite the following paper as the reference:

@inproceedings{He-WSDM-2021,
    title = "Improving Multi-hop Knowledge Base Question Answering by Learning Intermediate Supervision Signals",
    author = {Gaole He and
              Yunshi Lan and
              Jing Jiang and
              Wayne Xin Zhao and
              Ji{-}Rong Wen},
    booktitle = {{WSDM}},
    year = {2021},
}

Nobody guarantees the correctness of the data, its suitability for any particular purpose, or the validity of results based on the use of the data set. The data set may be used for any research purposes under the following conditions:

  • The user must acknowledge the use of the data set in publications resulting from the use of the data set.
  • The user may not redistribute the data without separate permission.
  • The user may not try to deanonymise the data.
  • The user may not use this information for any commercial or revenue-bearing purposes without first obtaining permission from us.

Issues

Step5 KeyError occurred

When I run simplify_dataset.py, it turns out that there are some irregular keys in the json file train.json. The data in this file was split automatically from CWQ_step1_01.json using train_test_split from sklearn.

The strange keys include ' royalty', which is supposed to be 'royalty', and '       Using various tricks of light, perspective and erasure...'
(both turn up in entity2id[obj['text']])

I guess there might be some errors in CWQ/subgraph/subgraph_hop2.txt or preprocess_step1.py. If needed, I can provide the code I used to split the data from CWQ_step1_01.json.

The output is here:

Traceback (most recent call last):
  File "simplify_dataset.py", line 70, in <module>
    simplify_data(input_file, output_file, entity2id, relation2id)
  File "simplify_dataset.py", line 42, in simplify_data
    tp_dict["subgraph"]["tuples"] = simplify_tuples(tp_dict["subgraph"]["tuples"], entity2id, relation2id)
  File "simplify_dataset.py", line 27, in simplify_tuples
    tail = entity2id[obj['text']]
KeyError: '       Using various tricks of light, perspective and erasure, the artworks in Shadows, Disappearances and Illusions each short-circuit the connection between the eye and the brain.'
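
The failing keys carry leading whitespace, so a mismatch with the stripped entries in entities.txt looks likely. One hedged workaround (not part of the repo) is to normalize the text before the lookup:

```python
def lookup_entity(entity2id, text):
    # Fall back to a whitespace-stripped key when the raw text is missing.
    # Workaround sketch only; verify it against your own preprocessing.
    if text in entity2id:
        return entity2id[text]
    return entity2id[text.strip()]
```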

Some question about the code

What does reason_layer do in the GNN_reasoning.py file? What is the meaning of the variables fact_rel, fact_prior, fact_value, fact_query, and neighbor_rep?
I referred to the implementation of GraftNet, but I still don't understand. Thank you for your reply.

A question about CWQ test dataset

Does the ComplexWebQuestions_test_wans.json file that appears in the code contain the answers to the questions? How do I find this file?

subgraph coverage for webqsp

Hello,
I am using the GraftNet preprocessing to generate question subgraphs for the webqsp dataset.
The number of questions is 4737 and the number of subgraphs that cover the answer entity is 4392, so the recall is around 0.93.
Could you please explain how to achieve the 94.9% coverage you mention in the table above?

about q_input

Hi, I am trying to test how different choices of q_input (query_text in basic_dataset) affect the results, and I ran into the following question.

What I found:
In NSM/data/basic_dataset.py, the call to the _prepare_dep function parses the dependency results from the training data's .dep files, but query_text only keeps each token, which amounts to taking the tokens of the question directly.

My change:
I commented out the call to _prepare_dep and instead parsed the question tokens directly in the _prepare_data function to build query_text (this part was originally commented out).

Result:
After this change, the test F1 and H1 of NSM and the NSM+h teacher both dropped by 2%, and the NSM+h student also dropped somewhat.

Question:
Does the dependency tree in the training data's .dep files directly affect the results? If so, where does the effect come in?

Thanks!

The way to get triples

Hello! Before running PageRank, how did you obtain the Freebase triples?

Is there a way to obtain (entity, relation, value) triples? At the moment are only (entity, relation, entity) triples available?

NotImplementedError

Hi, I get a NotImplementedError when running main_nsm.py. Could you help me with this? Many thanks.

MetaQA-3hop test coverage 100%

Hi, may I know how you check the MetaQA-3hop test coverage? In the paper, the coverage is 99%, but when I check it myself, the coverage is 100%. I want to know whether there is a difference between our metrics.
I simply find all the entities within 3 hops of the topic entity, and if any answer node is contained in these 3-hop entities, I consider the test question covered (see the sketch below).
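
A sketch of that coverage check under the stated assumptions (plain breadth-first expansion over a KB adjacency map; all names are illustrative):

```python
def khop_entities(adj, topic_entities, k=3):
    """Entities reachable within k hops of the topic entities;
    `adj` maps an entity id to an iterable of neighbor ids."""
    frontier, seen = set(topic_entities), set(topic_entities)
    for _ in range(k):
        frontier = {n for e in frontier for n in adj.get(e, ())} - seen
        seen |= frontier
    return seen

def is_covered(adj, topic_entities, answers, k=3):
    # A question counts as covered if any answer lies within k hops.
    return bool(set(answers) & khop_entities(adj, topic_entities, k))
```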

How can I use my own data to train the model?

Hi, I have a question for training and evaluating the model.

I have run the code on CWQ successfully. Now I want to train on my own data but don't know how.

My data is just a modified CWQ with small changes to the questions. So I tried to find where you load the questions, but failed.

I found that you have query_texts commented out in NSM/data/basic_dataset.py.

Can you tell me how to change the questions, or how I can use my own data for training?
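
One way to see what the loader expects is to inspect a line of *_simple.json directly; the snippet below only assumes that each line is a standalone JSON object whose subgraph tuples sit under "subgraph"/"tuples" (those keys appear in the tracebacks elsewhere on this page; other keys depend on the preprocessing):

```python
import json

with open("dataset/CWQ/train_simple.json") as f:
    first = json.loads(next(f))

print(sorted(first.keys()))             # inspect the per-question schema
print(first["subgraph"]["tuples"][:3])  # a few KB triples in global ids
```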

a short description regarding the files that need to be run

Hi,
Could you briefly explain which files need to be run?
According to the paper, I expect that the student network employs the teacher network, but I am a little bit confused among the different folders/files.
Thanks for your help!

KB id to entity names mapping

Hi very interesting work!

I noticed that in CWQ/entities.txt and webqsp/entities.txt only KG ids are included.
Is the mapping from these ids to their textual names stored somewhere? Or should I run the preprocessing steps myself to obtain it?

Thanks,
Costas

Request for the nsm-final.ckpt file

Hello, could you send me a copy of the nsm-final.ckpt file from /checkpoint/pretrain? I want to improve the teacher part, but the published ckpt_report files only contain the teacher and student checkpoints. Many thanks if you can; my email address is [email protected].

AssertionError: assert not torch.isnan(f2e_emb).any()

Hello! While running python main_teacher.py, line 56 of NSM/Modules/layer_nsm.py raises an AssertionError because f2e_emb contains NaN values. The error appears right after the first gradient update during training. How can I fix this?

The log is as follows:
Traceback (most recent call last):
  File "main_teacher.py", line 136, in <module>
    main()
  File "main_teacher.py", line 124, in main
    trainer.train(0, args.num_epoch - 1)
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/train/trainer_hybrid.py", line 83, in train
    loss, extras, h1_list_all, f1_list_all = self.train_epoch()
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/train/trainer_hybrid.py", line 161, in train_epoch
    loss, extras, _, tp_list = self.student(batch, training=True)
  File "/home/zhangzy/miniconda3/envs/MSQA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/Agent/TeacherAgent.py", line 24, in forward
    return self.model(batch, training=training)
  File "/home/zhangzy/miniconda3/envs/MSQA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/Model/hybrid_model.py", line 143, in forward
    self.init_reason(curr_dist=current_dist, local_entity=local_entity,
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/Model/hybrid_model.py", line 53, in init_reason
    self.local_entity_emb = self.get_ent_init(local_entity, kb_adj_mat, self.rel_features)
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/Model/base_model.py", line 124, in get_ent_init
    local_entity_emb = self.type_layer(local_entity=local_entity,
  File "/home/zhangzy/miniconda3/envs/MSQA/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/zhangzy/WSDM2021_NSM-main/NSM/Modules/layer_nsm.py", line 56, in forward
    assert not torch.isnan(f2e_emb).any()
AssertionError
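
One hedged way to localize NaNs like this (a general PyTorch debugging recipe, not repo code) is anomaly detection plus an explicit finiteness check ahead of the failing layer:

```python
import torch

# Report the backward op that produced a NaN gradient (slows training).
torch.autograd.set_detect_anomaly(True)

def assert_finite(name, tensor):
    # A readable replacement for the bare assert in layer_nsm.py.
    if torch.isnan(tensor).any() or torch.isinf(tensor).any():
        raise ValueError(f"{name} contains NaN/Inf values")

# e.g. just before line 56 of NSM/Modules/layer_nsm.py:
# assert_finite("f2e_emb", f2e_emb)
```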

FileNotFoundError: [Errno 2] No such file or directory: 'checkpoint/CWQ_student/../CWQ_teacher/CWQ_hybrid_teacher-final.ckpt'

I ran run_CWQ.sh; the full log is below, and I have downloaded all the data files.

Why does it keep running after the error?

How much memory does it need? My GPU has 24 GB but still runs out of memory.

Thanks.

(grailqa) xh@4210GPU:~/PycharmProject/NSM$ bash run_CWQ.sh
run_CWQ.sh: line 3: $'\r': command not found
2022-03-03 14:47:55,155 - root - INFO - PARAMETER----------
2022-03-03 14:47:55,155 - root - INFO - BATCH_SIZE=20
2022-03-03 14:47:55,156 - root - INFO - CHAR2ID=chars.txt
2022-03-03 14:47:55,156 - root - INFO - CHECKPOINT_DIR=checkpoint/CWQ_teacher/
2022-03-03 14:47:55,156 - root - INFO - CONSTRAIN_TYPE=js
2022-03-03 14:47:55,156 - root - INFO - DATA_FOLDER=dataset/CWQ/
2022-03-03 14:47:55,156 - root - INFO - DECAY_RATE=0.0
2022-03-03 14:47:55,156 - root - INFO - ENCODE_TYPE=True
2022-03-03 14:47:55,156 - root - INFO - ENCODER_TYPE=lstm
2022-03-03 14:47:55,156 - root - INFO - ENTITY2ID=entities.txt
2022-03-03 14:47:55,156 - root - INFO - ENTITY_DIM=50
2022-03-03 14:47:55,156 - root - INFO - ENTITY_EMB_FILE=None
2022-03-03 14:47:55,156 - root - INFO - ENTITY_KGE_FILE=None
2022-03-03 14:47:55,156 - root - INFO - ENTROPY_WEIGHT=0.0
2022-03-03 14:47:55,156 - root - INFO - EPS=0.95
2022-03-03 14:47:55,156 - root - INFO - EVAL_EVERY=2
2022-03-03 14:47:55,156 - root - INFO - EXPERIMENT_NAME=CWQ_hybrid_teacher
2022-03-03 14:47:55,156 - root - INFO - FACT_DROP=0
2022-03-03 14:47:55,156 - root - INFO - FACT_SCALE=3
2022-03-03 14:47:55,156 - root - INFO - FILTER_LABEL=False
2022-03-03 14:47:55,157 - root - INFO - FILTER_SUB=False
2022-03-03 14:47:55,157 - root - INFO - GRADIENT_CLIP=1.0
2022-03-03 14:47:55,157 - root - INFO - IS_EVAL=False
2022-03-03 14:47:55,157 - root - INFO - KG_DIM=100
2022-03-03 14:47:55,157 - root - INFO - KGE_DIM=100
2022-03-03 14:47:55,157 - root - INFO - LABEL_F1=0.5
2022-03-03 14:47:55,157 - root - INFO - LABEL_FILE=None
2022-03-03 14:47:55,157 - root - INFO - LABEL_SMOOTH=0.1
2022-03-03 14:47:55,157 - root - INFO - LAMBDA_BACK=0.1
2022-03-03 14:47:55,157 - root - INFO - LAMBDA_CONSTRAIN=0.01
2022-03-03 14:47:55,157 - root - INFO - LAMBDA_LABEL=0.01
2022-03-03 14:47:55,157 - root - INFO - LINEAR_DROPOUT=0.2
2022-03-03 14:47:55,157 - root - INFO - LOAD_EXPERIMENT=../pretrain/CWQ_nsm-final.ckpt
2022-03-03 14:47:55,157 - root - INFO - LOAD_PRETRAIN=None
2022-03-03 14:47:55,157 - root - INFO - LOG_LEVEL=info
2022-03-03 14:47:55,157 - root - INFO - LOSS_TYPE=kl
2022-03-03 14:47:55,157 - root - INFO - LR=0.0005
2022-03-03 14:47:55,157 - root - INFO - LR_SCHEDULE=False
2022-03-03 14:47:55,158 - root - INFO - LSTM_DROPOUT=0.3
2022-03-03 14:47:55,158 - root - INFO - MODE=teacher
2022-03-03 14:47:55,158 - root - INFO - MODEL_NAME=gnn
2022-03-03 14:47:55,158 - root - INFO - NAME=webqsp
2022-03-03 14:47:55,158 - root - INFO - NUM_EPOCH=70
2022-03-03 14:47:55,158 - root - INFO - NUM_LAYER=1
2022-03-03 14:47:55,158 - root - INFO - NUM_STEP=4
2022-03-03 14:47:55,158 - root - INFO - PRETRAINED_ENTITY_KGE_FILE=entity_emb_100d.npy
2022-03-03 14:47:55,158 - root - INFO - Q_TYPE=seq
2022-03-03 14:47:55,158 - root - INFO - REASON_KB=True
2022-03-03 14:47:55,158 - root - INFO - REL_WORD_IDS=rel_word_idx.npy
2022-03-03 14:47:55,158 - root - INFO - RELATION2ID=relations.txt
2022-03-03 14:47:55,158 - root - INFO - RELATION_EMB_FILE=None
2022-03-03 14:47:55,158 - root - INFO - RELATION_KGE_FILE=None
2022-03-03 14:47:55,158 - root - INFO - SEED=19960626
2022-03-03 14:47:55,158 - root - INFO - SHARE_EMBEDDING=False
2022-03-03 14:47:55,158 - root - INFO - SHARE_ENCODER=False
2022-03-03 14:47:55,158 - root - INFO - SHARE_INSTRUCTION=False
2022-03-03 14:47:55,158 - root - INFO - TEACHER_TYPE=hybrid
2022-03-03 14:47:55,159 - root - INFO - TEST_BATCH_SIZE=40
2022-03-03 14:47:55,159 - root - INFO - TRAIN_KL=False
2022-03-03 14:47:55,159 - root - INFO - TREE_SOFT=False
2022-03-03 14:47:55,159 - root - INFO - USE_CUDA=True
2022-03-03 14:47:55,159 - root - INFO - USE_INVERSE_RELATION=False
2022-03-03 14:47:55,159 - root - INFO - USE_LABEL=False
2022-03-03 14:47:55,159 - root - INFO - USE_SELF_LOOP=True
2022-03-03 14:47:55,159 - root - INFO - WORD2ID=vocab_new.txt
2022-03-03 14:47:55,159 - root - INFO - WORD_DIM=300
2022-03-03 14:47:55,159 - root - INFO - WORD_EMB_FILE=word_emb_300d.npy
2022-03-03 14:47:55,159 - root - INFO - -------------------
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/train_simple.json
27639it [02:00, 229.81it/s]
skip {13194, 9931, 17485, 10670, 17373, 1113, 21468, 509}
max_facts:  34098
converting global to local entity index ...
100%|██████████████████████████████████████████████████████████████████████| 27631/27631 [00:06<00:00, 4224.13it/
avg local entity:  1297.9829539285586
max local entity:  2001
preparing dep ...
100%|█████████████████████████████████████████████████████████████████████| 27631/27631 [00:02<00:00, 12304.01it/
preparing data ...
100%|███████████████████████████████████████████████████████████████████████| 27631/27631 [01:29<00:00, 307.72it/
27631 cases in total, 0 cases without query entity, 14953 cases with single query entity, 12678 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/dev_simple.json
3519it [00:06, 506.21it/s]
skip set()
max_facts:  32496
converting global to local entity index ...
100%|████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00:00, 4223.77it/
avg local entity:  1338.1057118499573
max local entity:  2001
preparing dep ...
100%|███████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00:00, 12308.09it/
preparing data ...
100%|█████████████████████████████████████████████████████████████████████████| 3519/3519 [00:11<00:00, 298.42it/
3519 cases in total, 0 cases without query entity, 1794 cases with single query entity, 1725 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/test_simple.json
3531it [00:23, 148.57it/s]
skip set()
max_facts:  34098
converting global to local entity index ...
100%|████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:00, 4302.08it/
avg local entity:  1337.5734919286322
max local entity:  2001
preparing dep ...
100%|███████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:00, 12662.46it/
preparing data ...
100%|█████████████████████████████████████████████████████████████████████████| 3531/3531 [00:11<00:00, 298.49it/
3531 cases in total, 0 cases without query entity, 1829 cases with single query entity, 1702 cases with multiple query entities
2022-03-03 14:52:33,432 - root - INFO - Building Agent.
Entity: 2429346, Relation: 6650, Word: 20049
Entity: 2429346, Relation: 6650, Word: 20049
2022-03-03 15:03:42,419 - root - INFO - Architecture: TeacherAgent_hybrid(
  (model): HybridModel(
    (relation_embedding): Embedding(6650, 200)
    (word_embedding): Embedding(20050, 300, padding_idx=20049)
    (entity_linear): Linear(in_features=100, out_features=50, bias=True)
    (relation_linear): Linear(in_features=200, out_features=50, bias=True)
    (lstm_drop): Dropout(p=0.3, inplace=False)
    (linear_drop): Dropout(p=0.2, inplace=False)
    (type_layer): TypeLayer(
      (linear_drop): Dropout(p=0.2, inplace=False)
      (kb_self_linear): Linear(in_features=50, out_features=50, bias=True)
    )
    (kld_loss): KLDivLoss()
    (bce_loss_logits): BCEWithLogitsLoss()
    (mse_loss): MSELoss()
    (instruction): LSTMInstruction(
      (lstm_drop): Dropout(p=0.3, inplace=False)
      (linear_drop): Dropout(p=0.2, inplace=False)
      (word_embedding): Embedding(20050, 300, padding_idx=20049)
      (node_encoder): LSTM(300, 50, batch_first=True)
      (cq_linear): Linear(in_features=100, out_features=50, bias=True)
      (ca_linear): Linear(in_features=50, out_features=1, bias=True)
      (question_linear0): Linear(in_features=50, out_features=50, bias=True)
      (question_linear1): Linear(in_features=50, out_features=50, bias=True)
      (question_linear2): Linear(in_features=50, out_features=50, bias=True)
      (question_linear3): Linear(in_features=50, out_features=50, bias=True)
    )
    (reasoning): GNNReasoning(
      (lstm_drop): Dropout(p=0.3, inplace=False)
      (linear_drop): Dropout(p=0.2, inplace=False)
      (softmax_d1): Softmax(dim=1)
      (score_func): Linear(in_features=50, out_features=1, bias=True)
      (rel_linear0): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear0): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear1): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear1): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear2): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear2): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear3): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear3): Linear(in_features=100, out_features=50, bias=True)
    )
    (back_reasoning): GNNBackwardReasoning(
      (lstm_drop): Dropout(p=0.3, inplace=False)
      (linear_drop): Dropout(p=0.2, inplace=False)
      (softmax_d1): Softmax(dim=1)
      (score_func): Linear(in_features=50, out_features=1, bias=True)
      (rel_linear0): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear0): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear1): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear1): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear2): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear2): Linear(in_features=100, out_features=50, bias=True)
      (rel_linear3): Linear(in_features=50, out_features=50, bias=True)
      (e2e_linear3): Linear(in_features=100, out_features=50, bias=True)
    )
    (constraint_loss): MSELoss()
    (kld_loss_1): KLDivLoss()
  )
)
2022-03-03 15:03:42,420 - root - INFO - Agent params: 7509253.0
Load ckpt from checkpoint/CWQ_teacher/../pretrain/CWQ_nsm-final.ckpt
Traceback (most recent call last):
  File "main_teacher.py", line 132, in <module>
    main()
  File "main_teacher.py", line 116, in main
    trainer = Trainer_hybrid(args=vars(args), logger=logger)
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_hybrid.py", line 45, in __init__
    self.load_pretrain()
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_hybrid.py", line 70, in load_pretrain
    self.load_ckpt(ckpt_path)
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_hybrid.py", line 185, in load_ckpt
    checkpoint = torch.load(filename)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/serialization.py", line 419, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'checkpoint/CWQ_teacher/../pretrain/CWQ_nsm-final.ckpt\r'
2022-03-03 15:04:43,154 - root - INFO - PARAMETER----------
2022-03-03 15:04:43,154 - root - INFO - BATCH_SIZE=20
2022-03-03 15:04:43,154 - root - INFO - CHAR2ID=chars.txt
2022-03-03 15:04:43,154 - root - INFO - CHECKPOINT_DIR=checkpoint/CWQ_student/
2022-03-03 15:04:43,154 - root - INFO - CONSTRAIN_TYPE=js
2022-03-03 15:04:43,154 - root - INFO - DATA_FOLDER=dataset/CWQ/
2022-03-03 15:04:43,154 - root - INFO - DECAY_RATE=0.0
2022-03-03 15:04:43,154 - root - INFO - ENCODE_TYPE=True
2022-03-03 15:04:43,155 - root - INFO - ENCODER_TYPE=lstm
2022-03-03 15:04:43,155 - root - INFO - ENTITY2ID=entities.txt
2022-03-03 15:04:43,155 - root - INFO - ENTITY_DIM=50
2022-03-03 15:04:43,155 - root - INFO - ENTITY_EMB_FILE=None
2022-03-03 15:04:43,155 - root - INFO - ENTITY_KGE_FILE=None
2022-03-03 15:04:43,155 - root - INFO - ENTROPY_WEIGHT=0.0
2022-03-03 15:04:43,155 - root - INFO - EPS=0.95
2022-03-03 15:04:43,155 - root - INFO - EVAL_EVERY=2
2022-03-03 15:04:43,155 - root - INFO - EXPERIMENT_NAME=CWQ_hybrid_student
2022-03-03 15:04:43,155 - root - INFO - FACT_DROP=0
2022-03-03 15:04:43,155 - root - INFO - FACT_SCALE=3
2022-03-03 15:04:43,155 - root - INFO - FILTER_LABEL=False
2022-03-03 15:04:43,155 - root - INFO - FILTER_SUB=False
2022-03-03 15:04:43,155 - root - INFO - GRADIENT_CLIP=1.0
2022-03-03 15:04:43,155 - root - INFO - IS_EVAL=False
2022-03-03 15:04:43,155 - root - INFO - KG_DIM=100
2022-03-03 15:04:43,155 - root - INFO - KGE_DIM=100
2022-03-03 15:04:43,155 - root - INFO - LABEL_F1=0.5
2022-03-03 15:04:43,155 - root - INFO - LABEL_FILE=None
2022-03-03 15:04:43,156 - root - INFO - LABEL_SMOOTH=0.1
2022-03-03 15:04:43,156 - root - INFO - LAMBDA_BACK=0.01
2022-03-03 15:04:43,156 - root - INFO - LAMBDA_CONSTRAIN=0.1
2022-03-03 15:04:43,156 - root - INFO - LAMBDA_LABEL=0.05
2022-03-03 15:04:43,156 - root - INFO - LINEAR_DROPOUT=0.2
2022-03-03 15:04:43,156 - root - INFO - LOAD_CKPT_FILE=None
2022-03-03 15:04:43,156 - root - INFO - LOAD_EXPERIMENT=None
2022-03-03 15:04:43,156 - root - INFO - LOAD_TEACHER=../CWQ_teacher/CWQ_hybrid_teacher-final.ckpt
2022-03-03 15:04:43,156 - root - INFO - LOG_LEVEL=info
2022-03-03 15:04:43,156 - root - INFO - LOSS_TYPE=kl
2022-03-03 15:04:43,156 - root - INFO - LR=0.0005
2022-03-03 15:04:43,156 - root - INFO - LR_SCHEDULE=False
2022-03-03 15:04:43,156 - root - INFO - LSTM_DROPOUT=0.3
2022-03-03 15:04:43,156 - root - INFO - MODE=teacher
2022-03-03 15:04:43,156 - root - INFO - MODEL_NAME=gnn
2022-03-03 15:04:43,156 - root - INFO - NAME=webqsp
2022-03-03 15:04:43,156 - root - INFO - NUM_EPOCH=100
2022-03-03 15:04:43,156 - root - INFO - NUM_LAYER=1
2022-03-03 15:04:43,157 - root - INFO - NUM_STEP=4
2022-03-03 15:04:43,157 - root - INFO - PRETRAINED_ENTITY_KGE_FILE=entity_emb_100d.npy
2022-03-03 15:04:43,157 - root - INFO - Q_TYPE=seq
2022-03-03 15:04:43,157 - root - INFO - REASON_KB=True
2022-03-03 15:04:43,157 - root - INFO - REL_WORD_IDS=rel_word_idx.npy
2022-03-03 15:04:43,157 - root - INFO - RELATION2ID=relations.txt
2022-03-03 15:04:43,157 - root - INFO - RELATION_EMB_FILE=None
2022-03-03 15:04:43,157 - root - INFO - RELATION_KGE_FILE=None
2022-03-03 15:04:43,157 - root - INFO - SEED=19960626
2022-03-03 15:04:43,157 - root - INFO - SHARE_EMBEDDING=False
2022-03-03 15:04:43,157 - root - INFO - SHARE_ENCODER=False
2022-03-03 15:04:43,157 - root - INFO - SHARE_INSTRUCTION=False
2022-03-03 15:04:43,157 - root - INFO - TEACHER_MODEL=gnn
2022-03-03 15:04:43,157 - root - INFO - TEACHER_TYPE=hybrid
2022-03-03 15:04:43,157 - root - INFO - TEST_BATCH_SIZE=40
2022-03-03 15:04:43,157 - root - INFO - TRAIN_KL=False
2022-03-03 15:04:43,157 - root - INFO - TREE_SOFT=False
2022-03-03 15:04:43,157 - root - INFO - USE_CUDA=True
2022-03-03 15:04:43,157 - root - INFO - USE_INVERSE_RELATION=False
2022-03-03 15:04:43,158 - root - INFO - USE_LABEL=False
2022-03-03 15:04:43,158 - root - INFO - USE_SELF_LOOP=True
2022-03-03 15:04:43,158 - root - INFO - WORD2ID=vocab_new.txt
2022-03-03 15:04:43,158 - root - INFO - WORD_DIM=300
2022-03-03 15:04:43,158 - root - INFO - WORD_EMB_FILE=word_emb_300d.npy
2022-03-03 15:04:43,158 - root - INFO - -------------------
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/train_simple.json
27639it [01:52, 245.21it/s]
skip {13194, 9931, 17485, 10670, 17373, 1113, 21468, 509}
max_facts:  34098
converting global to local entity index ...
100%|█████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:06<00
avg local entity:  1297.9829539285586
max local entity:  2001
preparing dep ...
100%|████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:02<00:
preparing data ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [01:30<0
27631 cases in total, 0 cases without query entity, 14953 cases with single query entity, 12678 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/dev_simple.json
3519it [00:07, 479.39it/s]
skip set()
max_facts:  32496
converting global to local entity index ...
100%|███████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00
avg local entity:  1338.1057118499573
max local entity:  2001
preparing dep ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00:
preparing data ...
100%|████████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:11<0
3519 cases in total, 0 cases without query entity, 1794 cases with single query entity, 1725 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/test_simple.json
3531it [00:25, 136.89it/s]
skip set()
max_facts:  34098
converting global to local entity index ...
100%|███████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00
avg local entity:  1337.5734919286322
max local entity:  2001
preparing dep ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:
preparing data ...
100%|████████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:11<0
3531 cases in total, 0 cases without query entity, 1829 cases with single query entity, 1702 cases with multiple query entities
2022-03-03 15:09:16,247 - root - INFO - Building Agent.
Entity: 2429346, Relation: 6650, Word: 20049
Entity: 2429346, Relation: 6650, Word: 20049
Traceback (most recent call last):
  File "main_student.py", line 128, in <module>
    main()
  File "main_student.py", line 114, in main
    trainer = Trainer_KBQA(args=vars(args), logger=logger)
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_student.py", line 49, in __init__
    len(self.word2id))
  File "/home2/xh/PycharmProject/NSM/NSM/train/init.py", line 34, in init_hybrid
    agent = TeacherAgent_hybrid(args, logger, num_entity, num_relation, num_word)
  File "/home2/xh/PycharmProject/NSM/NSM/Agent/TeacherAgent.py", line 20, in __init__
    self.model = HybridModel(args, num_entity, num_relation, num_word)
  File "/home2/xh/PycharmProject/NSM/NSM/Model/hybrid_model.py", line 32, in __init__
    self.to(self.device)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 426, in t
    return self._apply(convert)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 202, in _
    module._apply(fn)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in _
    param_applied = fn(param)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 424, in c
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
run_CWQ.sh: line 7: $'\r': command not found
2022-03-03 15:23:33,666 - root - INFO - PARAMETER----------
2022-03-03 15:23:33,666 - root - INFO - BATCH_SIZE=20
2022-03-03 15:23:33,666 - root - INFO - CHAR2ID=chars.txt
2022-03-03 15:23:33,666 - root - INFO - CHECKPOINT_DIR=checkpoint/CWQ_teacher/
2022-03-03 15:23:33,666 - root - INFO - CONSTRAIN_TYPE=js
2022-03-03 15:23:33,666 - root - INFO - DATA_FOLDER=dataset/CWQ/
2022-03-03 15:23:33,666 - root - INFO - DECAY_RATE=0.0
2022-03-03 15:23:33,666 - root - INFO - ENCODE_TYPE=True
2022-03-03 15:23:33,666 - root - INFO - ENCODER_TYPE=lstm
2022-03-03 15:23:33,666 - root - INFO - ENTITY2ID=entities.txt
2022-03-03 15:23:33,666 - root - INFO - ENTITY_DIM=50
2022-03-03 15:23:33,667 - root - INFO - ENTITY_EMB_FILE=None
2022-03-03 15:23:33,667 - root - INFO - ENTITY_KGE_FILE=None
2022-03-03 15:23:33,667 - root - INFO - ENTROPY_WEIGHT=0.0
2022-03-03 15:23:33,667 - root - INFO - EPS=0.95
2022-03-03 15:23:33,667 - root - INFO - EVAL_EVERY=2
2022-03-03 15:23:33,667 - root - INFO - EXPERIMENT_NAME=CWQ_parallel_teacher
2022-03-03 15:23:33,667 - root - INFO - FACT_DROP=0
2022-03-03 15:23:33,667 - root - INFO - FACT_SCALE=3
2022-03-03 15:23:33,667 - root - INFO - FILTER_LABEL=False
2022-03-03 15:23:33,667 - root - INFO - FILTER_SUB=False
2022-03-03 15:23:33,667 - root - INFO - GRADIENT_CLIP=1.0
2022-03-03 15:23:33,667 - root - INFO - IS_EVAL=False
2022-03-03 15:23:33,667 - root - INFO - KG_DIM=100
2022-03-03 15:23:33,667 - root - INFO - KGE_DIM=100
2022-03-03 15:23:33,667 - root - INFO - LABEL_F1=0.5
2022-03-03 15:23:33,667 - root - INFO - LABEL_FILE=None
2022-03-03 15:23:33,668 - root - INFO - LABEL_SMOOTH=0.1
2022-03-03 15:23:33,668 - root - INFO - LAMBDA_BACK=0.1
2022-03-03 15:23:33,668 - root - INFO - LAMBDA_CONSTRAIN=0.01
2022-03-03 15:23:33,668 - root - INFO - LAMBDA_LABEL=0.01
2022-03-03 15:23:33,668 - root - INFO - LINEAR_DROPOUT=0.2
2022-03-03 15:23:33,668 - root - INFO - LOAD_EXPERIMENT=None
2022-03-03 15:23:33,668 - root - INFO - LOAD_PRETRAIN=../pretrain/CWQ_nsm-final.ckpt
2022-03-03 15:23:33,668 - root - INFO - LOG_LEVEL=info
2022-03-03 15:23:33,668 - root - INFO - LOSS_TYPE=kl
2022-03-03 15:23:33,668 - root - INFO - LR=0.0005
2022-03-03 15:23:33,668 - root - INFO - LR_SCHEDULE=False
2022-03-03 15:23:33,668 - root - INFO - LSTM_DROPOUT=0.3
2022-03-03 15:23:33,668 - root - INFO - MODE=teacher
2022-03-03 15:23:33,668 - root - INFO - MODEL_NAME=gnn
2022-03-03 15:23:33,668 - root - INFO - NAME=webqsp
2022-03-03 15:23:33,669 - root - INFO - NUM_EPOCH=30
2022-03-03 15:23:33,669 - root - INFO - NUM_LAYER=1
2022-03-03 15:23:33,669 - root - INFO - NUM_STEP=4
2022-03-03 15:23:33,669 - root - INFO - PRETRAINED_ENTITY_KGE_FILE=entity_emb_100d.npy
2022-03-03 15:23:33,669 - root - INFO - Q_TYPE=seq
2022-03-03 15:23:33,669 - root - INFO - REASON_KB=True
2022-03-03 15:23:33,669 - root - INFO - REL_WORD_IDS=rel_word_idx.npy
2022-03-03 15:23:33,669 - root - INFO - RELATION2ID=relations.txt
2022-03-03 15:23:33,669 - root - INFO - RELATION_EMB_FILE=None
2022-03-03 15:23:33,669 - root - INFO - RELATION_KGE_FILE=None
2022-03-03 15:23:33,669 - root - INFO - SEED=19960626
2022-03-03 15:23:33,669 - root - INFO - SHARE_EMBEDDING=False
2022-03-03 15:23:33,669 - root - INFO - SHARE_ENCODER=False
2022-03-03 15:23:33,669 - root - INFO - SHARE_INSTRUCTION=False
2022-03-03 15:23:33,669 - root - INFO - TEACHER_TYPE=parallel
2022-03-03 15:23:33,669 - root - INFO - TEST_BATCH_SIZE=40
2022-03-03 15:23:33,670 - root - INFO - TRAIN_KL=False
2022-03-03 15:23:33,670 - root - INFO - TREE_SOFT=False
2022-03-03 15:23:33,670 - root - INFO - USE_CUDA=True
2022-03-03 15:23:33,670 - root - INFO - USE_INVERSE_RELATION=False
2022-03-03 15:23:33,670 - root - INFO - USE_LABEL=False
2022-03-03 15:23:33,670 - root - INFO - USE_SELF_LOOP=True
2022-03-03 15:23:33,670 - root - INFO - WORD2ID=vocab_new.txt
2022-03-03 15:23:33,670 - root - INFO - WORD_DIM=300
2022-03-03 15:23:33,670 - root - INFO - WORD_EMB_FILE=word_emb_300d.npy
2022-03-03 15:23:33,670 - root - INFO - -------------------
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/train_simple.json
27639it [01:46, 259.64it/s]
skip {13194, 9931, 17485, 10670, 17373, 1113, 21468, 509}
max_facts:  34098
converting global to local entity index ...
100%|█████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:06<00
avg local entity:  1297.9829539285586
max local entity:  2001
preparing dep ...
100%|████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:02<00:
preparing data ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [01:33<0
27631 cases in total, 0 cases without query entity, 14953 cases with single query entity, 12678 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/dev_simple.json
3519it [00:07, 498.71it/s]
skip set()
max_facts:  32496
converting global to local entity index ...
100%|███████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00
avg local entity:  1338.1057118499573
max local entity:  2001
preparing dep ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00:
preparing data ...
100%|████████████████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:11<0
3519 cases in total, 0 cases without query entity, 1794 cases with single query entity, 1725 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/test_simple.json
3531it [00:24, 145.61it/s]
skip set()
max_facts:  34098
converting global to local entity index ...
100%|███████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00
avg local entity:  1337.5734919286322
max local entity:  2001
preparing dep ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:
preparing data ...
100%|████████████████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:11<0
3531 cases in total, 0 cases without query entity, 1829 cases with single query entity, 1702 cases with multiple query entities
2022-03-03 15:28:02,001 - root - INFO - Building Agent.
Entity: 2429346, Relation: 6650, Word: 20049
Entity: 2429346, Relation: 6650, Word: 20049
Entity: 2429346, Relation: 6650, Word: 20049
Traceback (most recent call last):
  File "main_teacher.py", line 132, in <module>
    main()
  File "main_teacher.py", line 114, in main
    trainer = Trainer_parallel(args=vars(args), logger=logger)
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_parallel.py", line 41, in __init__
    len(self.word2id))
  File "/home2/xh/PycharmProject/NSM/NSM/train/init.py", line 23, in init_parallel
    agent = TeacherAgent_parallel(args, logger, num_entity, num_relation, num_word)
  File "/home2/xh/PycharmProject/NSM/NSM/Agent/TeacherAgent2.py", line 20, in __init__
    self.back_model = BackwardReasonModel(args, num_entity, num_relation, num_word, self.model)
  File "/home2/xh/PycharmProject/NSM/NSM/Model/backward_model.py", line 33, in __init__
    self.to(self.device)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 426, in t
    return self._apply(convert)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 202, in _
    module._apply(fn)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in _
    param_applied = fn(param)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 424, in c
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA out of memory. Tried to allocate 24.00 MiB (GPU 0; 23.70 GiB total capacity; 33.62 MiB already allocated; … MiB free; 12.38 MiB cached)
2022-03-03 15:41:59,677 - root - INFO - PARAMETER----------
2022-03-03 15:41:59,677 - root - INFO - BATCH_SIZE=20
2022-03-03 15:41:59,677 - root - INFO - CHAR2ID=chars.txt
2022-03-03 15:41:59,678 - root - INFO - CHECKPOINT_DIR=checkpoint/CWQ_student/
2022-03-03 15:41:59,678 - root - INFO - CONSTRAIN_TYPE=js
2022-03-03 15:41:59,678 - root - INFO - DATA_FOLDER=dataset/CWQ/
2022-03-03 15:41:59,678 - root - INFO - DECAY_RATE=0.0
2022-03-03 15:41:59,678 - root - INFO - ENCODE_TYPE=True
2022-03-03 15:41:59,678 - root - INFO - ENCODER_TYPE=lstm
2022-03-03 15:41:59,678 - root - INFO - ENTITY2ID=entities.txt
2022-03-03 15:41:59,678 - root - INFO - ENTITY_DIM=50
2022-03-03 15:41:59,678 - root - INFO - ENTITY_EMB_FILE=None
2022-03-03 15:41:59,678 - root - INFO - ENTITY_KGE_FILE=None
2022-03-03 15:41:59,678 - root - INFO - ENTROPY_WEIGHT=0.0
2022-03-03 15:41:59,678 - root - INFO - EPS=0.95
2022-03-03 15:41:59,679 - root - INFO - EVAL_EVERY=2
2022-03-03 15:41:59,679 - root - INFO - EXPERIMENT_NAME=CWQ_parallel_student
2022-03-03 15:41:59,679 - root - INFO - FACT_DROP=0
2022-03-03 15:41:59,679 - root - INFO - FACT_SCALE=3
2022-03-03 15:41:59,679 - root - INFO - FILTER_LABEL=False
2022-03-03 15:41:59,679 - root - INFO - FILTER_SUB=False
2022-03-03 15:41:59,679 - root - INFO - GRADIENT_CLIP=1.0
2022-03-03 15:41:59,679 - root - INFO - IS_EVAL=False
2022-03-03 15:41:59,679 - root - INFO - KG_DIM=100
2022-03-03 15:41:59,679 - root - INFO - KGE_DIM=100
2022-03-03 15:41:59,679 - root - INFO - LABEL_F1=0.5
2022-03-03 15:41:59,679 - root - INFO - LABEL_FILE=None
2022-03-03 15:41:59,679 - root - INFO - LABEL_SMOOTH=0.1
2022-03-03 15:41:59,680 - root - INFO - LAMBDA_BACK=0.01
2022-03-03 15:41:59,680 - root - INFO - LAMBDA_CONSTRAIN=0.1
2022-03-03 15:41:59,680 - root - INFO - LAMBDA_LABEL=0.05
2022-03-03 15:41:59,680 - root - INFO - LINEAR_DROPOUT=0.2
2022-03-03 15:41:59,680 - root - INFO - LOAD_CKPT_FILE=None
2022-03-03 15:41:59,680 - root - INFO - LOAD_EXPERIMENT=None
2022-03-03 15:41:59,680 - root - INFO - LOAD_TEACHER=../CWQ_teacher/CWQ_parallel_teacher-final.ckpt
2022-03-03 15:41:59,680 - root - INFO - LOG_LEVEL=info
2022-03-03 15:41:59,680 - root - INFO - LOSS_TYPE=kl
2022-03-03 15:41:59,680 - root - INFO - LR=0.0005
2022-03-03 15:41:59,680 - root - INFO - LR_SCHEDULE=False
2022-03-03 15:41:59,680 - root - INFO - LSTM_DROPOUT=0.3
2022-03-03 15:41:59,680 - root - INFO - MODE=teacher
2022-03-03 15:41:59,681 - root - INFO - MODEL_NAME=gnn
2022-03-03 15:41:59,681 - root - INFO - NAME=webqsp
2022-03-03 15:41:59,681 - root - INFO - NUM_EPOCH=100
2022-03-03 15:41:59,681 - root - INFO - NUM_LAYER=1
2022-03-03 15:41:59,681 - root - INFO - NUM_STEP=4
2022-03-03 15:41:59,681 - root - INFO - PRETRAINED_ENTITY_KGE_FILE=entity_emb_100d.npy
2022-03-03 15:41:59,681 - root - INFO - Q_TYPE=seq
2022-03-03 15:41:59,681 - root - INFO - REASON_KB=True
2022-03-03 15:41:59,681 - root - INFO - REL_WORD_IDS=rel_word_idx.npy
2022-03-03 15:41:59,681 - root - INFO - RELATION2ID=relations.txt
2022-03-03 15:41:59,681 - root - INFO - RELATION_EMB_FILE=None
2022-03-03 15:41:59,682 - root - INFO - RELATION_KGE_FILE=None
2022-03-03 15:41:59,682 - root - INFO - SEED=19960626
2022-03-03 15:41:59,682 - root - INFO - SHARE_EMBEDDING=False
2022-03-03 15:41:59,682 - root - INFO - SHARE_ENCODER=False
2022-03-03 15:41:59,682 - root - INFO - SHARE_INSTRUCTION=False
2022-03-03 15:41:59,682 - root - INFO - TEACHER_MODEL=gnn
2022-03-03 15:41:59,682 - root - INFO - TEACHER_TYPE=parallel
2022-03-03 15:41:59,682 - root - INFO - TEST_BATCH_SIZE=40
2022-03-03 15:41:59,682 - root - INFO - TRAIN_KL=False
2022-03-03 15:41:59,682 - root - INFO - TREE_SOFT=False
2022-03-03 15:41:59,682 - root - INFO - USE_CUDA=True
2022-03-03 15:41:59,682 - root - INFO - USE_INVERSE_RELATION=False
2022-03-03 15:41:59,682 - root - INFO - USE_LABEL=False
2022-03-03 15:41:59,682 - root - INFO - USE_SELF_LOOP=True
2022-03-03 15:41:59,682 - root - INFO - WORD2ID=vocab_new.txt
2022-03-03 15:41:59,682 - root - INFO - WORD_DIM=300
2022-03-03 15:41:59,682 - root - INFO - WORD_EMB_FILE=word_emb_300d.npy
2022-03-03 15:41:59,682 - root - INFO - -------------------
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/train_simple.json
27639it [02:06, 218.35it/s]
skip {13194, 9931, 17485, 10670, 17373, 1113, 21468, 509}
max_facts:  34098
converting global to local entity index ...
100%|█████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:06<00
avg local entity:  1297.9829539285586
max local entity:  2001
preparing dep ...
100%|████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [00:02<00:
preparing data ...
100%|██████████████████████████████████████████████████████████████████████████████████████| 27631/27631 [01:45<00:00, 262.00it/s]
27631 cases in total, 0 cases without query entity, 14953 cases with single query entity, 12678 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/dev_simple.json
3519it [00:08, 411.36it/s]
skip set()
max_facts:  32496
converting global to local entity index ...
100%|█████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:01<00:00, 2777.
avg local entity:  1338.1057118499573
max local entity:  2001
preparing dep ...
100%|█████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:00<00:00, 9096.
preparing data ...
100%|██████████████████████████████████████████████████████████████████████████████| 3519/3519 [00:15<00:00, 220.
3519 cases in total, 0 cases without query entity, 1794 cases with single query entity, 1725 cases with multiple query entities
building word index ...
Entity: 2429346, Relation in KB: 6649, Relation in use: 6650
loading data from dataset/CWQ/test_simple.json
3531it [00:28, 124.11it/s]
skip set()
max_facts:  34098
converting global to local entity index ...
100%|█████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:00, 4115.
avg local entity:  1337.5734919286322
max local entity:  2001
preparing dep ...
100%|████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:00<00:00, 12829.
preparing data ...
100%|██████████████████████████████████████████████████████████████████████████████| 3531/3531 [00:16<00:00, 217.
3531 cases in total, 0 cases without query entity, 1829 cases with single query entity, 1702 cases with multiple query entities
2022-03-03 15:47:15,270 - root - INFO - Building Agent.
Entity: 2429346, Relation: 6650, Word: 20049
Entity: 2429346, Relation: 6650, Word: 20049
Traceback (most recent call last):
  File "main_student.py", line 128, in <module>
    main()
  File "main_student.py", line 114, in main
    trainer = Trainer_KBQA(args=vars(args), logger=logger)
  File "/home2/xh/PycharmProject/NSM/NSM/train/trainer_student.py", line 46, in __init__
    len(self.word2id))
  File "/home2/xh/PycharmProject/NSM/NSM/train/init.py", line 23, in init_parallel
    agent = TeacherAgent_parallel(args, logger, num_entity, num_relation, num_word)
  File "/home2/xh/PycharmProject/NSM/NSM/Agent/TeacherAgent2.py", line 19, in __init__
    self.model = ForwardReasonModel(args, num_entity, num_relation, num_word)
  File "/home2/xh/PycharmProject/NSM/NSM/Model/forward_model.py", line 33, in __init__
    self.to(self.device)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 426, in t
    return self._apply(convert)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 202, in _
    module._apply(fn)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in _
    param_applied = fn(param)
  File "/home2/xh/.conda/envs/grailqa/lib/python3.6/site-packages/torch/nn/modules/module.py", line 424, in c
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
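
The `$'\r': command not found` lines and the checkpoint path ending in `\r` suggest run_CWQ.sh was saved with Windows (CRLF) line endings. A hedged fix, equivalent to dos2unix, is to rewrite the script with Unix endings before running it:

```python
# Strip carriage returns from the script in place (CRLF -> LF).
with open("run_CWQ.sh", "rb") as f:
    data = f.read()
with open("run_CWQ.sh", "wb") as f:
    f.write(data.replace(b"\r\n", b"\n"))
```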

Performance of the teacher network

Hi, I recently read your work and I'm very excited by your approach. Thanks for the awesome work!

I had some queries about your setup: the teacher network also has a forward operation that can be used to solve the KBQA task, i.e. it predicts an answer distribution.

  1. Would it be possible to report the results of just the teacher network on these datasets?
  2. If the teacher performs worse than the student, how can we explain the difference in performance? Is it that the teacher weights the consistency of forward and backward reasoning more than solving the actual task?
  3. If the teacher performs worse, how would hyperparameter tuning of the teacher network affect the performance of both the teacher and the student?

If you prefer some other medium of communication, please let me know. Thanks!

The number of reasoning steps

Could you please explain what K is in the paper?
It is mentioned as the k-th reasoning step. How many reasoning steps are there? Is it a hyperparameter? Does it depend on the number of hops of each question, on the length of the query (the number of hidden states of the LSTM), or on something else? Might it be 2 (forward reasoning and backward reasoning)?
Also, I would greatly appreciate it if you could briefly clarify how the answer is selected. The student part uses the supervision signals to find the answer; it applies forward reasoning and tries to reduce the loss function, and then what?
