spico197 / docee
🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
Home Page: https://doc-ee.readthedocs.io/
License: MIT License
**Problems**
Why doesn't ner_token_labels, the input to the NER model during training, include the extra entity types mentioned in the paper, such as Money and Time? I found that the entity label is checked here with an `in` test against a dict from DEEExample, but that list contains neither B-OtherType nor I-OtherType.
I'm preparing to experiment with an English dataset. Has the author run results on WikiEvents? The pretrained model names in scripts all look like Chinese models.
I want to look at only the parts related to this algorithm, not the rest. Is there a historical branch of the project containing that code?
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
Hello teacher, I corrected some problems as you suggested, and I'm happy that the project now runs. However, there are still a few small issues I cannot resolve and would like to ask you about. I will describe them below.
My problem is ...
You can reproduce the problem by ...
Teacher, when I run the program some entity attributes are not recognized (for example, company names are recognized, but court-ruling times are not, and this happens on many texts I tried). What could the problem be? Above I attached some outputs that I think may be relevant. Looking forward to your answer~
I have tried ..., but it goes to ...
I have checked the source codes, and the problem may come from ...
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
Hello author, which folder is the event_table.py file mentioned in the README in? I can't seem to find it.
When I train the code on dueefin, I can't find the dueefin_PTPCG_P1R1_wTgg.json file. My question is what the title says. Thanks for reading! Looking forward to your reply!
Hi, I will use PTPCG for news event extraction. The data has been processed into PTPCG's input format, and I have used trigger.py to get the importance scores for pseudo-trigger selection. Do I need to modify anything else, such as utils.py in the dee folder? What other details should I pay attention to?
Thanks for reading! Looking forward to your reply!
Hi, after reading the PTPCG model I have a question about the prediction process. Suppose the adjacency matrix has been predicted; the Combinations are then determined, and argument-role prediction is performed for each predicted event type against each Combination. Doesn't this carry an implicit assumption that every predicted event type shares the same Combinations, i.e., every event_type has the same number of event objects? Am I understanding this incorrectly?
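For readers unfamiliar with the decoding step being asked about, it could be sketched roughly as below. This is a hypothetical illustration of clique-based combination extraction from a thresholded adjacency matrix, not the toolkit's actual implementation; the function name and the 0.5 threshold are assumptions.

```python
import itertools

def combinations_from_adj(scores, threshold=0.5):
    """Binarize pairwise mention scores, then return the maximal cliques,
    each treated as one candidate argument combination."""
    n = len(scores)
    adj = [[scores[i][j] >= threshold for j in range(n)] for i in range(n)]

    def is_clique(nodes):
        # Every pair inside the set must be connected.
        return all(adj[a][b] for a, b in itertools.combinations(nodes, 2))

    cliques = [set(s)
               for r in range(1, n + 1)
               for s in itertools.combinations(range(n), r)
               if is_clique(s)]
    # Keep only maximal cliques (not strictly contained in another clique).
    return [sorted(c) for c in cliques
            if not any(c < other for other in cliques)]
```

Under this view, the combinations are produced once from the graph and then paired with every predicted event type, which is exactly what the question is probing.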
Many thanks for your contribution to DocEE~ Besides GIT, ACL 2021 also has DE-PPN, which performs quite well and uses a parallel decoding strategy, making it a good deal faster. Its code is also derived from Doc2EDAG: https://github.com/HangYang-NLP/DE-PPN
Do you plan to integrate it into this toolkit?~
I ran the model checkpoint you provided, which I assume was trained on ChFinAnn. I would now like to train on duee-fin, but my hardware is limited. Roughly how long would that take?
**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.
**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.
**Others**
Other things you may want to share or discuss.
I have my own entities extracted by another model. How easy would it be to apply DEE to these external entities using this work? I see that the entities and events share an LSTM at the beginning of processing.
Hello, Spico! I'm very glad to talk with you about event extraction. Is the order of event types (o2o, o2m, m2m) in the training data important for model performance? I find that the reproduction of Doc2EDAG in your paper scores (P=86.2, R=70.8, F=79.0, overall), but my reproduction only reaches (P=79.7, R=73.2, F=76.3, overall). I just git-cloned the code from the GitHub repo linked in the Doc2EDAG paper and ran it without modifying the data preprocessing.
Hello, while reading the code I noticed that in zheng2019_trigger_graph.py the pseudo trigger selected is the one with the highest importance on the test set. Is that reasonable?
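For context, importance-based pseudo-trigger selection could be sketched as below. This is only an illustration under assumptions: the actual scoring in trigger.py may differ, and the simple frequency count used here as the "importance" score is a stand-in.

```python
from collections import Counter

def rank_pseudo_triggers(event_arg_lists):
    """Rank candidate arguments by how often they occur across annotated
    events; higher counts come first (a stand-in 'importance' score)."""
    counts = Counter(arg for args in event_arg_lists for arg in args)
    return [arg for arg, _ in counts.most_common()]
```

The question above then amounts to asking which data split such statistics should legitimately be computed on.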
Hello teacher, I now want to retrain the PTPCG model. Running run_ptpcg.sh showed that my machine's specs are too low, so I plan to apply for a cloud platform for acceleration. I have read dee_task.py; if I now run run_dee_task.py via a shell, will that produce the model I want under the Exps folder? (Also, I'm not sure why save_cpt_flag=False in dee_task.train(save_cpt_flag=in_argv.save_cpt_flag); does that mean the model is not saved?)
I'm back again with another question:
Hello teacher, thank you for your guidance; sorry to bother you again. Today, after finishing the initialization work, I tested with the predict_one() function on two pieces of plain text. The first was random letters and the second was a financial-domain text. The first produced output, but the second hung completely. What could cause this? Have I overlooked some parameter? (I used Google's BERT model: https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip )
Do you have any materials or repos for document-level EE with triggers? Thanks.
For toolkit usage errors, you must strictly follow the Toolkit usage issue template to open a new issue.
Otherwise, your issue may be closed directly without further explanation.
The template can be found when you open a new issue.
Hi, I am trying to train PTPCG. I downloaded bert-base-chinese from the Hugging Face website, but an error occurs. Could I get your BERT model, to reduce the chance of errors? Thanks for reading.
For training on my own new data, how should I get started with the data processing? Are there concrete step-by-step instructions?
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Hello, I work on other NLP tasks, but I am very interested in extracting events from documents and came across your work.
After reading through the README I found very detailed reproduction instructions, but I would like to ask: is there a publicly released trained model and an inference API? That would make it easy to use this directly as a preprocessing step to obtain the events in documents from my own data, without retraining or reading the code.
Many thanks for your advice.
**Problems**
I referred to other run commands and executed the following:
TASK_NAME='PTPCG_R1_reproduction'
CUDA='0,1,2,3'
NUM_GPU=4
MODEL_NAME='TriggerAwarePrunedCompleteGraph'
CUDA_VISIBLE_DEVICES=${CUDA} ./scripts/train_multi.sh ${NUM_GPU} --task_name ${TASK_NAME} \
--use_bert=False \
--bert_model='/data/xxl/roberta-base-chinese/' \
--model_type=${MODEL_NAME} \
--cpt_file_name=${MODEL_NAME} \
--resume_latest_cpt=False \
--save_cpt_flag=False \
--save_best_cpt=True \
--remove_last_cpt=True \
--optimizer='adam' \
--learning_rate=0.0005 \
--dropout=0.1 \
--gradient_accumulation_steps=8 \
--train_batch_size=64 \
--eval_batch_size=16 \
--max_clique_decode=True \
--num_triggers=1 \
--eval_num_triggers=1 \
--with_left_trigger=True \
--directed_trigger_graph=True \
--use_scheduled_sampling=True \
--schedule_epoch_start=10 \
--schedule_epoch_length=10 \
--num_train_epochs=100 \
--run_mode='full' \
--skip_train=False \
--filtered_data_types='o2o,o2m,m2m' \
--re_eval_flag=False \
--add_greedy_dec=False \
--num_lstm_layers=2 \
--hidden_size=768 \
--biaffine_hidden_size=512 \
--biaffine_hard_threshold=0.5 \
--at_least_one_comb=True \
--include_complementary_ents=True \
--event_type_template='zheng2019_trigger_graph' \
--use_span_lstm=True \
--span_lstm_num_layer=2 \
--role_by_encoding=True \
--use_token_role=True \
--ment_feature_type='concat' \
--ment_type_hidden_size=32 \
--parallel_decorate
All the GPUs appear to be in use,
but the overall runtime has not improved (20 min); it is even somewhat longer than the single-GPU run. I don't understand this part well. Am I missing something?
My problem is ...
Hello teacher, I only started using GitHub properly a few days ago. I'm not sure whether this is what you meant by strictly following the template; if there is a problem, please bear with me.
I have a few more questions for you (I'm a bit embarrassed to keep bothering you~):
1. Is the bert-base-chinese in inference.py the one shown in Figure 1, and which of its files are required (vocab.txt is obviously needed)?
2. In Figure 2, does "going through it once" mean running the program once? I ran inference.py and got the result in Figure 3, and I don't know what caused it, which is why I'm asking such a simple question~
3. If it's about running, could you roughly describe the execution flow? I got a bit confused reading your README. I want to use your model to run some predictions.
You can reproduce the problem by ...
I have tried ..., but it goes to ...
I have checked the source codes, and the problem may come from ...
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
When running run_ptpcg_dueefin_withtgg_withptgg.sh, it errors saying self.setting has no doc_lang attribute, so I added one manually (I'm not sure whether this is correct; I set self.setting.doc_lang='zh').
Hello, I'm a student using Windows. I read your paper with great interest and saw your earlier answers about the API, but I still don't know how to use the model in my project after downloading it. Could you kindly advise when convenient?
Looking at the training, there is still a 0.1 probability [of using gold spans] at the end, so in the final stage of training the metrics should only rise when use_gold_span is True and basically do not grow otherwise. Do the dev-set results still have reference value in that case?
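For reference, the schedule suggested by the config fields seen elsewhere in this thread (schedule_epoch_start=10, schedule_epoch_length=10, min_teacher_prob=0.1) could look roughly like the sketch below. The exact formula in the toolkit may differ, so treat this as an assumption-laden illustration, not the actual implementation.

```python
def teacher_prob(epoch, start=10, length=10, min_prob=0.1):
    """Linearly decay the probability of feeding gold spans (teacher forcing)
    from 1.0 down to min_prob over `length` epochs, starting at `start`."""
    if epoch < start:
        return 1.0
    progress = min((epoch - start) / length, 1.0)
    return max(1.0 - progress, min_prob)
```

Under such a schedule, gold spans are still used with probability min_prob late in training, which may explain why dev metrics behave differently depending on whether use_gold_span is True.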
Using backend: pytorch
2022-02-23 15:51:43.263 | Level 20 | dee.tasks.base_task:logging:196 - ====================Check Setting Validity====================
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - Setting: {
"data_dir": "./Data",
"model_dir": "./Exps/jiao/Model",
"output_dir": "./Exps/jiao/Output",
"bert_model": "bert",
"train_file_name": "typed_train.json",
"dev_file_name": "typed_dev.json",
"test_file_name": "typed_test.json",
"max_seq_len": 128,
"train_batch_size": 16,
"eval_batch_size": 2,
"learning_rate": 0.0001,
"num_train_epochs": 10,
"warmup_proportion": 0.1,
"no_cuda": false,
"local_rank": -1,
"seed": 99,
"gradient_accumulation_steps": 8,
"optimize_on_cpu": false,
"fp16": false,
"loss_scale": 128,
"cpt_file_name": "Doc2EDAG",
"summary_dir_name": "./Exps/jiao/Summary/Summary",
"event_type_template": "jiao",
"max_sent_len": 128,
"max_sent_num": 64,
"use_lr_scheduler": false,
"lr_scheduler_step": 20,
"use_bert": false,
"use_biaffine_ner": false,
"use_masked_crf": false,
"only_master_logging": true,
"resume_latest_cpt": true,
"remove_last_cpt": false,
"save_best_cpt": false,
"model_type": "Doc2EDAG",
"rearrange_sent": false,
"use_crf_layer": true,
"min_teacher_prob": 0.1,
"schedule_epoch_start": 10,
"schedule_epoch_length": 10,
"loss_lambda": 0.05,
"loss_gamma": 1.0,
"add_greedy_dec": true,
"use_token_role": true,
"seq_reduce_type": "MaxPooling",
"hidden_size": 768,
"dropout": 0.1,
"ff_size": 1024,
"num_tf_layers": 4,
"use_path_mem": true,
"use_scheduled_sampling": true,
"use_doc_enc": true,
"neg_field_loss_scaling": 3.0,
"gcn_layer": 3,
"ner_num_tf_layers": 4,
"num_lstm_layers": 1,
"use_span_lstm": false,
"span_lstm_num_layer": 1,
"use_span_att": false,
"span_att_heads": 4,
"dot_att_head": 4,
"comb_samp_min_num_span": 2,
"comb_samp_num_samp": 100,
"comb_samp_max_samp_times": 1000,
"use_span_lstm_projection": false,
"biaffine_hidden_size": 256,
"triaffine_hidden_size": 150,
"vi_max_iter": 3,
"biaffine_hard_threshold": 0.5,
"event_cls_loss_weight": 1.0,
"smooth_attn_loss_weight": 1.0,
"combination_loss_weight": 1.0,
"comb_cls_loss_weight": 1.0,
"comb_sim_loss_weight": 1.0,
"span_cls_loss_weight": 1.0,
"use_comb_cls_pred": false,
"role_loss_weight": 1.0,
"event_relevant_combination": false,
"run_mode": "full",
"drop_irr_ents": false,
"at_least_one_comb": true,
"include_complementary_ents": true,
"filtered_data_types": "o2o",
"ent_context_window": 20,
"biaffine_grad_clip": false,
"global_grad_clip": false,
"ent_fix_mode": "n",
"span_mention_sum": false,
"add_adj_mat_weight_bias": false,
"optimizer": "adam",
"num_triggers": 1,
"eval_num_triggers": 1,
"with_left_trigger": true,
"with_all_one_trigger_comb": false,
"directed_trigger_graph": false,
"adj_sim_head": 1,
"adj_sim_agg": "mean",
"adj_sim_split_head": false,
"num_triggering_steps": 1,
"use_shared_dropout_proj": false,
"use_layer_norm_b4_biaffine": false,
"remove_mention_type_layer_norm": false,
"use_token_drop": false,
"guessing_decode": false,
"max_clique_decode": true,
"try_to_make_up": false,
"self_loop": false,
"incremental_min_conn": -1,
"use_span_self_att": false,
"use_smooth_span_self_att": false,
"ment_feature_type": "plus",
"ment_type_hidden_size": 32,
"num_mention_lstm_layer": 1,
"gat_alpha": 0.2,
"gat_num_heads": 4,
"gat_num_layers": 2,
"role_by_encoding": false,
"use_mention_lstm": false,
"mlp_before_adj_measure": false,
"use_field_cls_mlp": false,
"build_dense_connected_doc_graph": false,
"stop_gradient": false,
"doc_lang": "zh"
}
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - ====================Init Device====================
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - device cuda n_gpu 2 distributed training False
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - ====================Reset Random Seed to 99====================
2022-02-23 15:51:43.297 | Level 20 | dee.tasks.base_task:logging:196 - Init Summary Writer
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Writing summary into ./Exps/jiao/Summary/Summary-Feb23_15-51-43
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Initializing DEETask
file bert/config.json not found
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'BertTokenizerForDocEE'.
[('Build', ['CompanyName', 'Product', 'Address', 'StartTime', 'Country'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['CompanyName', 'Product', 'StartTime'], 4: ['Address', 'CompanyName', 'Product', 'StartTime'], 5: ['Address', 'CompanyName', 'Country', 'Product', 'StartTime'], 'all': ['CompanyName', 'Product', 'Address', 'StartTime', 'Country']}, 5), ('Violated', ['CompanyName', 'Law', 'StartTime', 'Address', 'Character'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['Character', 'CompanyName', 'StartTime'], 4: ['Address', 'Character', 'CompanyName', 'StartTime'], 5: ['Address', 'Character', 'CompanyName', 'Law', 'StartTime'], 'all': ['CompanyName', 'Law', 'StartTime', 'Address', 'Character']}, 5)]
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.token_embedding.weight torch.Size([21128, 768]) 16226304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.pos_embedding.weight torch.Size([128, 768]) 98304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.trans_mat torch.Size([17, 17]) 289
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.weight torch.Size([17, 768]) 13056
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.bias torch.Size([17]) 17
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.embedding.weight torch.Size([64, 768]) 49152
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.embedding.weight torch.Size([15, 768]) 11520
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:389 - #Total Trainable Parameters: 63716682
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:390 - #Total Fixed Parameters: 0
2022-02-23 15:51:44.693 | Level 20 | dee.tasks.base_task:logging:196 - ====================Decorate Model====================
Traceback (most recent call last):
File "/home/jiaojiaxin/DocEE/run_dee_task.py", line 208, in
parallel_decorate=in_argv.parallel_decorate,
File "/home/jiaojiaxin/DocEE/dee/tasks/dee_task.py", line 392, in init
self._decorate_model(parallel_decorate=parallel_decorate)
File "/home/jiaojiaxin/DocEE/dee/tasks/base_task.py", line 474, in _decorate_model
self.model.to(self.device)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
Hello, I have a few questions while reading these models. How many Transformer layers does GIT use for NER?
When building the heterogeneous graph, what does the GCN output? And in the subsequent event detection step, what is used as the document vector for event classification?
I couldn't fully follow the paper and the code, so I'd appreciate your explanation.
Thanks 🙏
Hello. Question 1: in the latest inference.py I noticed an `unk` entry in filter_event_type. In what situation is it used? As far as I remember, only o2o, o2m, m2m, and overall were supported before.
Question 2: do you know of, or can you recommend, any event detection models for open-domain Chinese news? Because of the expressive diversity and data distribution of news, your model cannot be used for prediction directly: it still predicts many wrong events (documents containing no events, or documents with semantically similar but undefined events misclassified as defined ones), so an event detection stage needs to be added in front.
Currently I use trigger words to trigger events (this guarantees that documents containing no events at all are filtered out, but it may produce many false recalls and filter out defined events whose trigger words are not in the lexicon), plus thresholding on your model's extraction results and the document vector (to filter out low-confidence results and keep the extracted results precise), but this still does not solve the problem.
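The thresholding step described above can be sketched as a simple post-filter. The prediction layout and the `score` field below are assumptions for illustration, not the toolkit's actual output format:

```python
# Drop predicted events whose confidence falls below a threshold.
# The dict layout and "score" field are hypothetical.
def filter_low_confidence(pred_events, threshold=0.5):
    return [ev for ev in pred_events if ev.get("score", 0.0) >= threshold]

preds = [
    {"event_type": "Marry", "score": 0.91},
    {"event_type": "Pledge", "score": 0.18},
]
print(filter_low_confidence(preds))  # keeps only the high-confidence event
```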
Hello, and thank you for your open-source work. I ran into some problems in practice and would like to ask for your advice.
1. While building my own dataset I noticed an interesting phenomenon. With triggers annotated, I tried the two template definitions below; in both cases add_triggers was set to False when building the dataset with build_data.py. The results differ a lot: the second template gets a much higher F1. Do you know why?
2. Also, TRIGGERS defines many modes; which one is finally used? Is it selected by the num_triggers parameter we set?
3. In the final evaluation some fields are null. Are those filtered out automatically during evaluation?
Looking forward to your reply.
class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = [
        "Trigger",
        "Marry_loc", "Marry_wife", "Marry_time", "Marry_husband",
    ]
    TRIGGERS = {
        1: ["Marry_time"],  # importance: 0.9686967372778184
        2: ["Marry_husband", "Marry_time"],  # importance: 0.9842342342342343
        3: ["Marry_husband", "Marry_loc", "Marry_time"],  # importance: 0.9887387387387387
        4: ["Marry_husband", "Marry_loc", "Marry_time", "Trigger"],  # importance: 0.9887387387387387
        5: ["Marry_husband", "Marry_loc", "Marry_time", "Marry_wife", "Trigger"],  # importance: 0.9887387387387387
    }
    TRIGGERS["all"] = ["Marry_time", "Marry_loc", "Marry_husband", "Marry_wife", "Trigger"]

class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = ["Trigger", "loc", "wife", "time", "husband"]
    TRIGGERS = {
        1: ["Trigger"],
        2: ["Trigger", "loc"],
        3: ["Trigger", "loc", "wife"],
        4: ["Trigger", "loc", "wife", "time"],
        5: ["Trigger", "loc", "wife", "time", "husband"],
    }
    TRIGGERS["all"] = ["Trigger", "loc", "wife", "time", "husband"]

    def __init__(self, recguid=None):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS)
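Regarding question 2, here is a minimal sketch (an assumption for illustration, not the toolkit's actual implementation) of how a num_triggers setting could select one pseudo-trigger combination from the TRIGGERS dict:

```python
# Hypothetical BaseEvent showing key-field selection by num_triggers.
class BaseEvent:
    def __init__(self, fields, event_name=None, recguid=None):
        self.fields = fields
        self.event_name = event_name
        self.recguid = recguid
        self.key_fields = None

    def set_key_fields(self, triggers, num_triggers=1):
        # Pick the pseudo-trigger combination of the configured size;
        # unknown sizes fall back to the "all" entry.
        self.key_fields = triggers.get(num_triggers, triggers["all"])

class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = ["Trigger", "loc", "wife", "time", "husband"]
    TRIGGERS = {
        1: ["Trigger"],
        2: ["Trigger", "loc"],
    }
    TRIGGERS["all"] = FIELDS

    def __init__(self, recguid=None, num_triggers=1):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS, num_triggers=num_triggers)

event = MarryEvent(num_triggers=2)
print(event.key_fields)  # ['Trigger', 'loc']
```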
Hello Spico, I'm reproducing your paper and using your online demo. I found that a new (unseen) event gets misclassified. Can the model learn to classify the event list as null if I add some empty samples, like this:
{"text": "I love China", "event_list": []}
Hi, following your repo I have finished training the 'PTPCG_P1-DuEE_fin-woTgg-wOtherType' task (i.e. |R|=1, Tgg=×). I have some questions about the results in the Results folder.
total_results: [ { "ModelType": "TriggerAwarePrunedCompleteGraph", "Total": { "precision": "68.8", "recall": "62.4", "f1": "65.5" }]
Is the f1 value in total_results computed in the way shown in the figure below (Figure 1)?
Secondly, regarding:
"m2m": { "classification": { "precision": "94.696", "recall": "93.718", "f1": "94.204" }, "entity": { "precision": "80.362", "recall": "85.863", "f1": "83.022" }, "combination": { "precision": "22.823", "recall": "24.050", "f1": "23.421" }, "rawCombination": { "precision": "22.448", "recall": "21.756", "f1": "22.097" }, "overall": { "precision": "68.838", "recall": "62.408", "f1": "65.465" }, "instance": { "precision": "21.141", "recall": "22.977", "f1": "22.021" } }
Is the combination score here the extraction score for arguments? If so, does an f1 of around 22 suggest that the model is a poor reference for the argument extraction task?
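For reference, a precision/recall/f1 triple like the ones above is typically a micro-average over true-positive, false-positive, and false-negative counts. The sketch below shows that generic computation; whether PTPCG's scorer uses exactly these counts is not confirmed here:

```python
# Generic micro-averaged precision/recall/F1 from TP/FP/FN counts.
def prf1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up counts, purely for illustration.
p, r, f = prf1(tp=68, fp=32, fn=41)
print(round(p, 3), round(r, 3), round(f, 3))
```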
Thanks for sharing. Regarding the GPU requirement (e.g. "Tip: At least 4 * NVIDIA V100 GPU (32GB) cards are required to run GIT models."): is it only for training speed, or does it also affect the trained model's performance?
Reducing the batch size or using gradient accumulation makes it possible to train with fewer resources (though training takes longer). Have you tried this, and does it affect the final model quality?
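On gradient accumulation: summing the gradients of several micro-batches before a single parameter update reproduces the large-batch gradient for a loss that averages over samples. Below is a framework-agnostic sketch; in PyTorch this corresponds to calling loss.backward() on each micro-batch (with the loss scaled by 1/k) and optimizer.step() only every k micro-batches:

```python
def grad_mse(w, batch):
    """Gradient d/dw of mean((w*x - y)^2) over a batch of (x, y) pairs."""
    n = len(batch)
    return sum(2 * (w * x - y) * x for x, y in batch) / n

data = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]
w = 0.5

# Full-batch gradient.
g_full = grad_mse(w, data)

# Gradient accumulated over 2 equal-size micro-batches, then averaged.
micro = [data[:2], data[2:]]
g_accum = sum(grad_mse(w, mb) for mb in micro) / len(micro)

print(abs(g_full - g_accum) < 1e-12)  # the two gradients match
```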
Hello, when training on a single machine with multiple GPUs, I get the following error:
Traceback (most recent call last):
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 587, in get_loss_on_batch
teacher_prob=teacher_prob,
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/DocEE-main/dee/models/trigger_aware.py", line 172, in forward
ent_fix_mode=self.config.ent_fix_mode,
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 305, in get_doc_arg_rel_info_list
) = get_span_mention_info(span_dranges_list, doc_token_type_mat)
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 16, in get_span_mention_info
mention_type_list.append(doc_token_type_list[sent_idx][char_s])
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run_dee_task.py", line 274, in
dee_task.train(save_cpt_flag=in_argv.save_cpt_flag)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 656, in train
base_epoch_idx=resume_base_epoch,
File "/data/home/qianbenchen/DocEE-main/dee/tasks/base_task.py", line 693, in base_train
total_loss = get_loss_func(self, batch, **kwargs_dict1)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 598, in get_loss_on_batch
raise Exception("Cannot get the loss")
Has this problem been solved? Thanks!
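A hedged sketch of the kind of bounds check that would avoid the IndexError above: the mention's character offset can exceed the tokenized sentence length (e.g. after truncation), so guard before indexing. The helper name and data layout are illustrative, not the toolkit's actual code:

```python
# Safely look up a token type, returning a default on out-of-range indices.
def safe_mention_type(doc_token_type_list, sent_idx, char_s, default=0):
    if sent_idx >= len(doc_token_type_list):
        return default
    sent_types = doc_token_type_list[sent_idx]
    if char_s >= len(sent_types):
        return default
    return sent_types[char_s]

types = [[1, 2, 3], [4, 5]]
print(safe_mention_type(types, 1, 1))  # in range -> 5
print(safe_mention_type(types, 1, 9))  # out of range -> 0
```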
Hello. Figure 1 of your paper involves two concepts, mention type and argument role, and the Entity Representation part also uses mention-type information. I understand what the two mean and how they differ, but is mention type actually annotated in the DuEE dataset? I checked, and it seems to contain only argument role.
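If a dataset annotates only argument roles, a coarse mention type can be derived heuristically from the role name. The mapping below is purely illustrative; the actual roles and types depend on the dataset and configuration:

```python
# Hypothetical role -> coarse mention type mapping.
ROLE_TO_MENTION_TYPE = {
    "Marry_time": "Time",
    "Marry_loc": "Location",
    "Marry_wife": "Person",
    "Marry_husband": "Person",
}

def mention_type_of(role, default="OtherType"):
    return ROLE_TO_MENTION_TYPE.get(role, default)

print(mention_type_of("Marry_wife"))  # Person
print(mention_type_of("Unknown"))     # OtherType
```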