spico197 / docee
🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
Home Page: https://doc-ee.readthedocs.io/
License: MIT License
**Problems**
Why doesn't ner_token_labels, the input to the NER model during training, include the extra entity types mentioned in the paper, such as Money and Time? I found that the entity label is checked here with an `in` test against a dict from DEEExample, but that list contains neither B-OtherType nor I-OtherType.
I'm preparing to experiment with an English dataset. Has the author run results on WikiEvents? The pretrained model names in scripts all look like Chinese models.
I want to look at only the parts related to this algorithm, not the rest. Is there a historical branch of the project containing that code?
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
Hello teacher, I corrected some problems as you suggested, and I'm happy that the project now runs. However, there are still a few small issues I cannot resolve and would like to ask you about. I will describe them below.
My problem is ...
You can reproduce the problem by ...
Teacher, when I run the program some entity attributes are not recognized (for example, company names are recognized, but court-ruling times are not, and this happens on many texts I tried). What could the problem be? Above I attached some outputs that I think may be relevant. Looking forward to your answer~
I have tried ..., but it goes to ...
I have checked the source codes, and the problem may come from ...
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
Hello author, which folder is the event_table.py file mentioned in the README in? I can't seem to find it.
When I train the code on dueefin, I can't find the dueefin_PTPCG_P1R1_wTgg.json file. My question is what the title says. Thanks for reading! Looking forward to your reply!
Hi, I will use PTPCG for news event extraction. The data has been processed into PTPCG's input format, and I have used trigger.py to get the importance scores for pseudo-trigger selection. Do I need to modify anything else, such as utils.py in the dee folder? What other details should I pay attention to?
Thanks for reading! Looking forward to your reply!
Hi, after reading the PTPCG model I have a question about the prediction process. Suppose the adjacency matrix has been predicted; the Combinations are then determined, and argument-role prediction is performed for each predicted event type against each Combination. Doesn't this carry an implicit assumption that every predicted event type shares the same Combinations, i.e., every event_type has the same number of event objects? Am I understanding this incorrectly?
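For readers unfamiliar with the decoding step being asked about, it could be sketched roughly as below. This is a hypothetical illustration of clique-based combination extraction from a thresholded adjacency matrix, not the toolkit's actual implementation; the function name and the 0.5 threshold are assumptions.

```python
import itertools

def combinations_from_adj(scores, threshold=0.5):
    """Binarize pairwise mention scores, then return the maximal cliques,
    each treated as one candidate argument combination."""
    n = len(scores)
    adj = [[scores[i][j] >= threshold for j in range(n)] for i in range(n)]

    def is_clique(nodes):
        # Every pair inside the set must be connected.
        return all(adj[a][b] for a, b in itertools.combinations(nodes, 2))

    cliques = [set(s)
               for r in range(1, n + 1)
               for s in itertools.combinations(range(n), r)
               if is_clique(s)]
    # Keep only maximal cliques (not strictly contained in another clique).
    return [sorted(c) for c in cliques
            if not any(c < other for other in cliques)]
```

Under this view, the combinations are produced once from the graph and then paired with every predicted event type, which is exactly what the question is probing.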
Many thanks for your contribution to DocEE~ Besides GIT, ACL 2021 also has DE-PPN, which performs quite well and uses a parallel decoding strategy, making it a good deal faster. Its code is also derived from Doc2EDAG: https://github.com/HangYang-NLP/DE-PPN
Do you plan to integrate it into this toolkit?~
I ran the model checkpoint you provided, which I assume was trained on ChFinAnn. I would now like to train on duee-fin, but my hardware is limited. Roughly how long would that take?
**Idea sharing**
While sharing what you want to do, make sure to protect your ideas.
**Problems**
If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.
**Others**
Other things you may want to share or discuss.
I have my own entities extracted by another model. How easy would it be to apply DEE to these external entities using this work? I see that the entities and events share an LSTM at the beginning of processing.
Hello, Spico! I'm very glad to talk with you about event extraction. Is the order of event types (o2o, o2m, m2m) in the training data important for model performance? I find that the reproduction of Doc2EDAG in your paper scores (P=86.2, R=70.8, F=79.0, overall), but my reproduction only reaches (P=79.7, R=73.2, F=76.3, overall). I just git-cloned the code from the GitHub repo linked in the Doc2EDAG paper and ran it without modifying the data preprocessing.
Hello, while reading the code I noticed that in zheng2019_trigger_graph.py the pseudo trigger selected is the one with the highest importance on the test set. Is that reasonable?
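For context, importance-based pseudo-trigger selection could be sketched as below. This is only an illustration under assumptions: the actual scoring in trigger.py may differ, and the simple frequency count used here as the "importance" score is a stand-in.

```python
from collections import Counter

def rank_pseudo_triggers(event_arg_lists):
    """Rank candidate arguments by how often they occur across annotated
    events; higher counts come first (a stand-in 'importance' score)."""
    counts = Counter(arg for args in event_arg_lists for arg in args)
    return [arg for arg, _ in counts.most_common()]
```

The question above then amounts to asking which data split such statistics should legitimately be computed on.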
Hello teacher, I now want to retrain the PTPCG model. Running run_ptpcg.sh showed that my machine's specs are too low, so I plan to apply for a cloud platform for acceleration. I have read dee_task.py; if I now run run_dee_task.py via a shell, will that produce the model I want under the Exps folder? (Also, I'm not sure why save_cpt_flag=False in dee_task.train(save_cpt_flag=in_argv.save_cpt_flag); does that mean the model is not saved?)
I'm back again with another question:
Hello teacher, thank you for your guidance; sorry to bother you again. Today, after finishing the initialization work, I tested with the predict_one() function on two pieces of plain text. The first was random letters and the second was a financial-domain text. The first produced output, but the second hung completely. What could cause this? Have I overlooked some parameter? (I used Google's BERT model: https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip )
Do you have any materials or repos for document-level EE with triggers? Thanks.
For toolkit usage errors, you must strictly follow the Toolkit usage issue template to open a new issue.
Otherwise, your issue may be closed directly without further explanation.
The template can be found when you open a new issue.
Hi, I am trying to train PTPCG. I downloaded bert-base-chinese from the Hugging Face website, but an error occurs. Could I get your BERT model, to reduce the chance of errors? Thanks for reading.
For training on my own new data, how should I get started with the data processing? Are there concrete step-by-step instructions?
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Hello, I work on other NLP tasks, but I am very interested in extracting events from documents and came across your work.
After reading through the README I found very detailed reproduction instructions, but I would like to ask: is there a publicly released trained model and an inference API? That would make it easy to use this directly as a preprocessing step to obtain the events in documents from my own data, without retraining or reading the code.
Many thanks for your advice.
**Problems**
I referred to other run commands and executed the following:
TASK_NAME='PTPCG_R1_reproduction'
CUDA='0,1,2,3'
NUM_GPU=4
MODEL_NAME='TriggerAwarePrunedCompleteGraph'
CUDA_VISIBLE_DEVICES=${CUDA} ./scripts/train_multi.sh ${NUM_GPU} --task_name ${TASK_NAME} \
--use_bert=False \
--bert_model='/data/xxl/roberta-base-chinese/' \
--model_type=${MODEL_NAME} \
--cpt_file_name=${MODEL_NAME} \
--resume_latest_cpt=False \
--save_cpt_flag=False \
--save_best_cpt=True \
--remove_last_cpt=True \
--optimizer='adam' \
--learning_rate=0.0005 \
--dropout=0.1 \
--gradient_accumulation_steps=8 \
--train_batch_size=64 \
--eval_batch_size=16 \
--max_clique_decode=True \
--num_triggers=1 \
--eval_num_triggers=1 \
--with_left_trigger=True \
--directed_trigger_graph=True \
--use_scheduled_sampling=True \
--schedule_epoch_start=10 \
--schedule_epoch_length=10 \
--num_train_epochs=100 \
--run_mode='full' \
--skip_train=False \
--filtered_data_types='o2o,o2m,m2m' \
--re_eval_flag=False \
--add_greedy_dec=False \
--num_lstm_layers=2 \
--hidden_size=768 \
--biaffine_hidden_size=512 \
--biaffine_hard_threshold=0.5 \
--at_least_one_comb=True \
--include_complementary_ents=True \
--event_type_template='zheng2019_trigger_graph' \
--use_span_lstm=True \
--span_lstm_num_layer=2 \
--role_by_encoding=True \
--use_token_role=True \
--ment_feature_type='concat' \
--ment_type_hidden_size=32 \
--parallel_decorate
All the GPUs appear to be in use,
but the overall runtime has not improved (20 min); it is even somewhat longer than the single-GPU run. I don't understand this part well. Am I missing something?
My problem is ...
Hello teacher, I only started using GitHub properly a few days ago. I'm not sure whether this is what you meant by strictly following the template; if there is a problem, please bear with me.
I have a few more questions for you (I'm a bit embarrassed to keep bothering you~):
1. Is the bert-base-chinese in inference.py the one shown in Figure 1, and which of its files are required (vocab.txt is obviously needed)?
2. In Figure 2, does "going through it once" mean running the program once? I ran inference.py and got the result in Figure 3, and I don't know what caused it, which is why I'm asking such a simple question~
3. If it's about running, could you roughly describe the execution flow? I got a bit confused reading your README. I want to use your model to run some predictions.
You can reproduce the problem by ...
I have tried ..., but it goes to ...
I have checked the source codes, and the problem may come from ...
Environment | Values |
---|---|
System | Windows/Linux |
GPU Device | |
CUDA Version | |
Python Version | |
PyTorch Version | |
dee (the Toolkit) Version | |
Log:
When running run_ptpcg_dueefin_withtgg_withptgg.sh, it errors saying self.setting has no doc_lang attribute, so I added one manually (I'm not sure whether this is correct; I set self.setting.doc_lang='zh').
Hello, I'm a student using Windows. I read your paper with great interest and saw your earlier answers about the API, but I still don't know how to use the model in my project after downloading it. Could you kindly advise when convenient?
Looking at the training, there is still a 0.1 probability [of using gold spans] at the end, so in the final stage of training the metrics should only rise when use_gold_span is True and basically do not grow otherwise. Do the dev-set results still have reference value in that case?
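For reference, the schedule suggested by the config fields seen elsewhere in this thread (schedule_epoch_start=10, schedule_epoch_length=10, min_teacher_prob=0.1) could look roughly like the sketch below. The exact formula in the toolkit may differ, so treat this as an assumption-laden illustration, not the actual implementation.

```python
def teacher_prob(epoch, start=10, length=10, min_prob=0.1):
    """Linearly decay the probability of feeding gold spans (teacher forcing)
    from 1.0 down to min_prob over `length` epochs, starting at `start`."""
    if epoch < start:
        return 1.0
    progress = min((epoch - start) / length, 1.0)
    return max(1.0 - progress, min_prob)
```

Under such a schedule, gold spans are still used with probability min_prob late in training, which may explain why dev metrics behave differently depending on whether use_gold_span is True.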
Using backend: pytorch
2022-02-23 15:51:43.263 | Level 20 | dee.tasks.base_task:logging:196 - ====================Check Setting Validity====================
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - Setting: {
"data_dir": "./Data",
"model_dir": "./Exps/jiao/Model",
"output_dir": "./Exps/jiao/Output",
"bert_model": "bert",
"train_file_name": "typed_train.json",
"dev_file_name": "typed_dev.json",
"test_file_name": "typed_test.json",
"max_seq_len": 128,
"train_batch_size": 16,
"eval_batch_size": 2,
"learning_rate": 0.0001,
"num_train_epochs": 10,
"warmup_proportion": 0.1,
"no_cuda": false,
"local_rank": -1,
"seed": 99,
"gradient_accumulation_steps": 8,
"optimize_on_cpu": false,
"fp16": false,
"loss_scale": 128,
"cpt_file_name": "Doc2EDAG",
"summary_dir_name": "./Exps/jiao/Summary/Summary",
"event_type_template": "jiao",
"max_sent_len": 128,
"max_sent_num": 64,
"use_lr_scheduler": false,
"lr_scheduler_step": 20,
"use_bert": false,
"use_biaffine_ner": false,
"use_masked_crf": false,
"only_master_logging": true,
"resume_latest_cpt": true,
"remove_last_cpt": false,
"save_best_cpt": false,
"model_type": "Doc2EDAG",
"rearrange_sent": false,
"use_crf_layer": true,
"min_teacher_prob": 0.1,
"schedule_epoch_start": 10,
"schedule_epoch_length": 10,
"loss_lambda": 0.05,
"loss_gamma": 1.0,
"add_greedy_dec": true,
"use_token_role": true,
"seq_reduce_type": "MaxPooling",
"hidden_size": 768,
"dropout": 0.1,
"ff_size": 1024,
"num_tf_layers": 4,
"use_path_mem": true,
"use_scheduled_sampling": true,
"use_doc_enc": true,
"neg_field_loss_scaling": 3.0,
"gcn_layer": 3,
"ner_num_tf_layers": 4,
"num_lstm_layers": 1,
"use_span_lstm": false,
"span_lstm_num_layer": 1,
"use_span_att": false,
"span_att_heads": 4,
"dot_att_head": 4,
"comb_samp_min_num_span": 2,
"comb_samp_num_samp": 100,
"comb_samp_max_samp_times": 1000,
"use_span_lstm_projection": false,
"biaffine_hidden_size": 256,
"triaffine_hidden_size": 150,
"vi_max_iter": 3,
"biaffine_hard_threshold": 0.5,
"event_cls_loss_weight": 1.0,
"smooth_attn_loss_weight": 1.0,
"combination_loss_weight": 1.0,
"comb_cls_loss_weight": 1.0,
"comb_sim_loss_weight": 1.0,
"span_cls_loss_weight": 1.0,
"use_comb_cls_pred": false,
"role_loss_weight": 1.0,
"event_relevant_combination": false,
"run_mode": "full",
"drop_irr_ents": false,
"at_least_one_comb": true,
"include_complementary_ents": true,
"filtered_data_types": "o2o",
"ent_context_window": 20,
"biaffine_grad_clip": false,
"global_grad_clip": false,
"ent_fix_mode": "n",
"span_mention_sum": false,
"add_adj_mat_weight_bias": false,
"optimizer": "adam",
"num_triggers": 1,
"eval_num_triggers": 1,
"with_left_trigger": true,
"with_all_one_trigger_comb": false,
"directed_trigger_graph": false,
"adj_sim_head": 1,
"adj_sim_agg": "mean",
"adj_sim_split_head": false,
"num_triggering_steps": 1,
"use_shared_dropout_proj": false,
"use_layer_norm_b4_biaffine": false,
"remove_mention_type_layer_norm": false,
"use_token_drop": false,
"guessing_decode": false,
"max_clique_decode": true,
"try_to_make_up": false,
"self_loop": false,
"incremental_min_conn": -1,
"use_span_self_att": false,
"use_smooth_span_self_att": false,
"ment_feature_type": "plus",
"ment_type_hidden_size": 32,
"num_mention_lstm_layer": 1,
"gat_alpha": 0.2,
"gat_num_heads": 4,
"gat_num_layers": 2,
"role_by_encoding": false,
"use_mention_lstm": false,
"mlp_before_adj_measure": false,
"use_field_cls_mlp": false,
"build_dense_connected_doc_graph": false,
"stop_gradient": false,
"doc_lang": "zh"
}
2022-02-23 15:51:43.264 | Level 20 | dee.tasks.base_task:logging:196 - ====================Init Device====================
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - device cuda n_gpu 2 distributed training False
2022-02-23 15:51:43.296 | Level 20 | dee.tasks.base_task:logging:196 - ====================Reset Random Seed to 99====================
2022-02-23 15:51:43.297 | Level 20 | dee.tasks.base_task:logging:196 - Init Summary Writer
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Writing summary into ./Exps/jiao/Summary/Summary-Feb23_15-51-43
2022-02-23 15:51:44.384 | Level 20 | dee.tasks.base_task:logging:196 - Initializing DEETask
file bert/config.json not found
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'BertTokenizerForDocEE'.
[('Build', ['CompanyName', 'Product', 'Address', 'StartTime', 'Country'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['CompanyName', 'Product', 'StartTime'], 4: ['Address', 'CompanyName', 'Product', 'StartTime'], 5: ['Address', 'CompanyName', 'Country', 'Product', 'StartTime'], 'all': ['CompanyName', 'Product', 'Address', 'StartTime', 'Country']}, 5), ('Violated', ['CompanyName', 'Law', 'StartTime', 'Address', 'Character'], {1: ['CompanyName'], 2: ['CompanyName', 'StartTime'], 3: ['Character', 'CompanyName', 'StartTime'], 4: ['Address', 'Character', 'CompanyName', 'StartTime'], 5: ['Address', 'Character', 'CompanyName', 'Law', 'StartTime'], 'all': ['CompanyName', 'Law', 'StartTime', 'Address', 'Character']}, 5)]
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.token_embedding.weight torch.Size([21128, 768]) 16226304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.pos_embedding.weight torch.Size([128, 768]) 98304
2022-02-23 15:51:44.651 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_embedding.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.652 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.653 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.654 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.655 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.656 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.657 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.658 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.659 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.660 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.661 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.662 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.token_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.trans_mat torch.Size([17, 17]) 289
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.weight torch.Size([17, 768]) 13056
2022-02-23 15:51:44.663 | INFO | dee.tasks.dee_task:init:377 - Trainable: ner_model.crf_layer.hidden2tag.bias torch.Size([17]) 17
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.664 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.665 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.666 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.0.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_query torch.Size([1, 768]) 768
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.event_cls.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.0.bias torch.Size([2]) 2
2022-02-23 15:51:44.667 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.1.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.2.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.3.bias torch.Size([2]) 2
2022-02-23 15:51:44.668 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.weight torch.Size([2, 768]) 1536
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_cls_list.4.bias torch.Size([2]) 2
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.0 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.1 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.2 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.3 torch.Size([1, 768]) 768
2022-02-23 15:51:44.669 | INFO | dee.tasks.dee_task:init:377 - Trainable: event_tables.1.field_queries.4 torch.Size([1, 768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.embedding.weight torch.Size([64, 768]) 49152
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: sent_pos_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.embedding.weight torch.Size([15, 768]) 11520
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.670 | INFO | dee.tasks.dee_task:init:377 - Trainable: ment_type_encoder.layer_norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.671 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.672 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.673 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.674 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.675 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.676 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.677 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.678 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.679 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.680 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.681 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: doc_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.682 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.683 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.0.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.684 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.685 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.686 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.1.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.687 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.688 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.689 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.2.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.0.bias torch.Size([768]) 768
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.690 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.1.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.2.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.weight torch.Size([768, 768]) 589824
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.self_attn.linears.3.bias torch.Size([768]) 768
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.weight torch.Size([1024, 768]) 786432
2022-02-23 15:51:44.691 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_1.bias torch.Size([1024]) 1024
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.weight torch.Size([768, 1024]) 786432
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.feed_forward.w_2.bias torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.0.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.692 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.layers.3.sublayer.1.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.norm.gamma torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:377 - Trainable: field_context_encoder.norm.beta torch.Size([768]) 768
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:389 - #Total Trainable Parameters: 63716682
2022-02-23 15:51:44.693 | INFO | dee.tasks.dee_task:init:390 - #Total Fixed Parameters: 0
2022-02-23 15:51:44.693 | Level 20 | dee.tasks.base_task:logging:196 - ====================Decorate Model====================
Traceback (most recent call last):
File "/home/jiaojiaxin/DocEE/run_dee_task.py", line 208, in
parallel_decorate=in_argv.parallel_decorate,
File "/home/jiaojiaxin/DocEE/dee/tasks/dee_task.py", line 392, in init
self._decorate_model(parallel_decorate=parallel_decorate)
File "/home/jiaojiaxin/DocEE/dee/tasks/base_task.py", line 474, in _decorate_model
self.model.to(self.device)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/root/anaconda3/envs/zhtorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
Hello, I have a few questions while reading these models. How many Transformer layers does GIT use for NER?
When building the heterogeneous graph, what does the GCN output? And in the subsequent event detection step, what is used as the document vector for event classification?
I couldn't fully follow the paper and the code, so I'd appreciate your explanation.
Thanks 🙏
Hello. Question 1: in the latest inference.py I noticed an `unk` entry in filter_event_type. In what situation is it used? As far as I remember, only o2o, o2m, m2m, and overall were supported before.
Question 2: do you know of, or can you recommend, any event detection models for open-domain Chinese news? Because of the expressive diversity and data distribution of news, your model cannot be used for prediction directly: it still predicts many wrong events (documents containing no events, or documents with semantically similar but undefined events misclassified as defined ones), so an event detection stage needs to be added in front.
Currently I use trigger words to trigger events (this guarantees that documents containing no events at all are filtered out, but it may produce many false recalls and filter out defined events whose trigger words are not in the lexicon), plus thresholding on your model's extraction results and the document vector (to filter out low-confidence results and keep the extracted results precise), but this still does not solve the problem.
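The thresholding step described above can be sketched as a simple post-filter. The prediction layout and the `score` field below are assumptions for illustration, not the toolkit's actual output format:

```python
# Drop predicted events whose confidence falls below a threshold.
# The dict layout and "score" field are hypothetical.
def filter_low_confidence(pred_events, threshold=0.5):
    return [ev for ev in pred_events if ev.get("score", 0.0) >= threshold]

preds = [
    {"event_type": "Marry", "score": 0.91},
    {"event_type": "Pledge", "score": 0.18},
]
print(filter_low_confidence(preds))  # keeps only the high-confidence event
```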
Hello, and thank you for your open-source work. I ran into some problems in practice and would like to ask for your advice.
1. While building my own dataset I noticed an interesting phenomenon. With triggers annotated, I tried the two template definitions below; in both cases add_triggers was set to False when building the dataset with build_data.py. The results differ a lot: the second template gets a much higher F1. Do you know why?
2. Also, TRIGGERS defines many modes; which one is finally used? Is it selected by the num_triggers parameter we set?
3. In the final evaluation some fields are null. Are those filtered out automatically during evaluation?
Looking forward to your reply.
class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = [
        "Trigger",
        "Marry_loc", "Marry_wife", "Marry_time", "Marry_husband",
    ]
    TRIGGERS = {
        1: ["Marry_time"],  # importance: 0.9686967372778184
        2: ["Marry_husband", "Marry_time"],  # importance: 0.9842342342342343
        3: ["Marry_husband", "Marry_loc", "Marry_time"],  # importance: 0.9887387387387387
        4: ["Marry_husband", "Marry_loc", "Marry_time", "Trigger"],  # importance: 0.9887387387387387
        5: ["Marry_husband", "Marry_loc", "Marry_time", "Marry_wife", "Trigger"],  # importance: 0.9887387387387387
    }
    TRIGGERS["all"] = ["Marry_time", "Marry_loc", "Marry_husband", "Marry_wife", "Trigger"]

class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = ["Trigger", "loc", "wife", "time", "husband"]
    TRIGGERS = {
        1: ["Trigger"],
        2: ["Trigger", "loc"],
        3: ["Trigger", "loc", "wife"],
        4: ["Trigger", "loc", "wife", "time"],
        5: ["Trigger", "loc", "wife", "time", "husband"],
    }
    TRIGGERS["all"] = ["Trigger", "loc", "wife", "time", "husband"]

    def __init__(self, recguid=None):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS)
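Regarding question 2, here is a minimal sketch (an assumption for illustration, not the toolkit's actual implementation) of how a num_triggers setting could select one pseudo-trigger combination from the TRIGGERS dict:

```python
# Hypothetical BaseEvent showing key-field selection by num_triggers.
class BaseEvent:
    def __init__(self, fields, event_name=None, recguid=None):
        self.fields = fields
        self.event_name = event_name
        self.recguid = recguid
        self.key_fields = None

    def set_key_fields(self, triggers, num_triggers=1):
        # Pick the pseudo-trigger combination of the configured size;
        # unknown sizes fall back to the "all" entry.
        self.key_fields = triggers.get(num_triggers, triggers["all"])

class MarryEvent(BaseEvent):
    NAME = "Marry"
    FIELDS = ["Trigger", "loc", "wife", "time", "husband"]
    TRIGGERS = {
        1: ["Trigger"],
        2: ["Trigger", "loc"],
    }
    TRIGGERS["all"] = FIELDS

    def __init__(self, recguid=None, num_triggers=1):
        super().__init__(self.FIELDS, event_name=self.NAME, recguid=recguid)
        self.set_key_fields(self.TRIGGERS, num_triggers=num_triggers)

event = MarryEvent(num_triggers=2)
print(event.key_fields)  # ['Trigger', 'loc']
```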
Hello Spico, I'm reproducing your paper and using your online demo. I found that a new (unseen) event gets misclassified. Can the model learn to classify the event list as null if I add some empty samples, like this:
{"text": "I love China", "event_list": []}
Hi, following your repo I have finished training the 'PTPCG_P1-DuEE_fin-woTgg-wOtherType' task (i.e. |R|=1, Tgg=×). I have some questions about the results in the Results folder.
total_results: [ { "ModelType": "TriggerAwarePrunedCompleteGraph", "Total": { "precision": "68.8", "recall": "62.4", "f1": "65.5" }]
Is the f1 value in total_results computed in the way shown in the figure below (Figure 1)?
Secondly, regarding:
"m2m": { "classification": { "precision": "94.696", "recall": "93.718", "f1": "94.204" }, "entity": { "precision": "80.362", "recall": "85.863", "f1": "83.022" }, "combination": { "precision": "22.823", "recall": "24.050", "f1": "23.421" }, "rawCombination": { "precision": "22.448", "recall": "21.756", "f1": "22.097" }, "overall": { "precision": "68.838", "recall": "62.408", "f1": "65.465" }, "instance": { "precision": "21.141", "recall": "22.977", "f1": "22.021" } }
Is the combination score here the extraction score for arguments? If so, does an f1 of around 22 suggest that the model is a poor reference for the argument extraction task?
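For reference, a precision/recall/f1 triple like the ones above is typically a micro-average over true-positive, false-positive, and false-negative counts. The sketch below shows that generic computation; whether PTPCG's scorer uses exactly these counts is not confirmed here:

```python
# Generic micro-averaged precision/recall/F1 from TP/FP/FN counts.
def prf1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up counts, purely for illustration.
p, r, f = prf1(tp=68, fp=32, fn=41)
print(round(p, 3), round(r, 3), round(f, 3))
```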
Thanks for sharing. Regarding the GPU requirement (e.g. "Tip: At least 4 * NVIDIA V100 GPU (32GB) cards are required to run GIT models."): is it only for training speed, or does it also affect the trained model's performance?
Reducing the batch size or using gradient accumulation makes it possible to train with fewer resources (though training takes longer). Have you tried this, and does it affect the final model quality?
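On gradient accumulation: summing the gradients of several micro-batches before a single parameter update reproduces the large-batch gradient for a loss that averages over samples. Below is a framework-agnostic sketch; in PyTorch this corresponds to calling loss.backward() on each micro-batch (with the loss scaled by 1/k) and optimizer.step() only every k micro-batches:

```python
def grad_mse(w, batch):
    """Gradient d/dw of mean((w*x - y)^2) over a batch of (x, y) pairs."""
    n = len(batch)
    return sum(2 * (w * x - y) * x for x, y in batch) / n

data = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0), (4.0, 9.0)]
w = 0.5

# Full-batch gradient.
g_full = grad_mse(w, data)

# Gradient accumulated over 2 equal-size micro-batches, then averaged.
micro = [data[:2], data[2:]]
g_accum = sum(grad_mse(w, mb) for mb in micro) / len(micro)

print(abs(g_full - g_accum) < 1e-12)  # the two gradients match
```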
Hello, when training on a single machine with multiple GPUs, I get the following error:
Traceback (most recent call last):
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 587, in get_loss_on_batch
teacher_prob=teacher_prob,
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/home/qianbenchen/DocEE-main/dee/models/trigger_aware.py", line 172, in forward
ent_fix_mode=self.config.ent_fix_mode,
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 305, in get_doc_arg_rel_info_list
) = get_span_mention_info(span_dranges_list, doc_token_type_mat)
File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 16, in get_span_mention_info
mention_type_list.append(doc_token_type_list[sent_idx][char_s])
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run_dee_task.py", line 274, in
dee_task.train(save_cpt_flag=in_argv.save_cpt_flag)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 656, in train
base_epoch_idx=resume_base_epoch,
File "/data/home/qianbenchen/DocEE-main/dee/tasks/base_task.py", line 693, in base_train
total_loss = get_loss_func(self, batch, **kwargs_dict1)
File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 598, in get_loss_on_batch
raise Exception("Cannot get the loss")
Has this problem been solved? Thanks!
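A hedged sketch of the kind of bounds check that would avoid the IndexError above: the mention's character offset can exceed the tokenized sentence length (e.g. after truncation), so guard before indexing. The helper name and data layout are illustrative, not the toolkit's actual code:

```python
# Safely look up a token type, returning a default on out-of-range indices.
def safe_mention_type(doc_token_type_list, sent_idx, char_s, default=0):
    if sent_idx >= len(doc_token_type_list):
        return default
    sent_types = doc_token_type_list[sent_idx]
    if char_s >= len(sent_types):
        return default
    return sent_types[char_s]

types = [[1, 2, 3], [4, 5]]
print(safe_mention_type(types, 1, 1))  # in range -> 5
print(safe_mention_type(types, 1, 9))  # out of range -> 0
```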
Hello. Figure 1 of your paper involves two concepts, mention type and argument role, and the Entity Representation part also uses mention-type information. I understand what the two mean and how they differ, but is mention type actually annotated in the DuEE dataset? I checked, and it seems to contain only argument role.
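If a dataset annotates only argument roles, a coarse mention type can be derived heuristically from the role name. The mapping below is purely illustrative; the actual roles and types depend on the dataset and configuration:

```python
# Hypothetical role -> coarse mention type mapping.
ROLE_TO_MENTION_TYPE = {
    "Marry_time": "Time",
    "Marry_loc": "Location",
    "Marry_wife": "Person",
    "Marry_husband": "Person",
}

def mention_type_of(role, default="OtherType"):
    return ROLE_TO_MENTION_TYPE.get(role, default)

print(mention_type_of("Marry_wife"))  # Person
print(mention_type_of("Unknown"))     # OtherType
```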