aistudio@jupyter-208728-1765888:~/Knover$ git branch -av
  develop                      dcf05a0 Support PaddlePaddle 2.0.
* master                       4bad22c Fix checkpoints and add document for continuous training (#31)
  remotes/origin/HEAD          -> origin/develop
  remotes/origin/develop       dcf05a0 Support PaddlePaddle 2.0.
  remotes/origin/dygraph       5a2fbec Support dygraph in PaddlePaddle 2.0 and add lic2021 baseline
  remotes/origin/luge-dialogue 1b03ac1 update score
  remotes/origin/master        4bad22c Fix checkpoints and add document for continuous training (#31)
  remotes/origin/plato-2       4bad22c Fix checkpoints and add document for continuous training (#31)
aistudio@jupyter-208728-1765888:~/Knover$ python infer.py --model Plato --task DialogGeneration --vocab_path ./projects/lic2021/conf/vocab.txt --spm_model_file ./projects/lic2021/conf/spm.model --infer_file ./data/lic2021/test.txt --data_format numerical --file_format file --config_path ./projects/lic2021/conf/12L_P.json --init_pretraining_params Plato --batch_size 2 --max_src_len 384 --max_tgt_len 128 --max_seq_len 512 --output_name response --decoding_strategy topk_sampling --do_generation True --num_samples 4 --topk 5 --is_cn True --save_path ./projects/lic2021/infer/output --log_step 10
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/setuptools/depends.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
{
  "is_distributed": false,
  "save_path": "./projects/lic2021/infer/output",
  "infer_file": "./data/lic2021/test.txt",
  "output_name": "response",
  "log_steps": 10,
  "Model": {
    "model": "Plato",
    "config_path": "./projects/lic2021/conf/12L_P.json",
    "init_checkpoint": "",
    "init_pretraining_params": "Plato",
    "learning_rate": 1e-05,
    "warmup_steps": 0,
    "weight_decay": 0.0,
    "max_grad_norm": 0.1,
    "use_recompute": false,
    "use_amp": false,
    "amp_loss_scaling": 12800,
    "max_seq_len": 512,
    "weight_sharing": true,
    "mem_efficient": false,
    "use_bow": true,
    "use_entropy": false,
    "pre_encoder_cmd": "d",
    "preprocess_cmd": "n",
    "postprocess_cmd": "da",
    "post_cls_cmd": "n",
    "cls_bias": true,
    "attention_probs_dropout_prob": 0.1,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "initializer_range": 0.02,
    "max_position_embeddings": 512,
    "latent_type_size": 20,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "type_vocab_size": 2,
    "role_type_size": 32,
    "vocab_size": 30004
  },
  "Generator": {
    "min_dec_len": 1,
    "max_dec_len": 64,
    "decoding_strategy": "topk_sampling",
    "temperature": 1.0,
    "ignore_unk": true,
    "num_samples": 4,
    "topk": 5,
    "topp": 0.9,
    "beam_size": 10,
    "length_average": true,
    "length_penalty": 0.0
  },
  "Task": {
    "task": "DialogGeneration",
    "do_generation": true,
    "is_cn": true,
    "nsp_inference_model_path": null,
    "nsp_attention_style": "bidirectional",
    "ranking_score": "decode_score"
  },
  "Reader": {
    "max_src_len": 384,
    "max_tgt_len": 128,
    "truncate_first_turn": false,
    "file_format": "file",
    "data_format": "numerical",
    "in_tokens": false,
    "batch_size": 2,
    "continuous_position": true,
    "random_seed": 11,
    "sort_pool_size": 65536
  },
  "Tokenizer": {
    "tokenizer": "SentencePieceTokenizer",
    "vocab_path": "./projects/lic2021/conf/vocab.txt",
    "do_lower_case": false,
    "spm_model_file": "./projects/lic2021/conf/spm.model"
  },
  "run_infer": true
}
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/unified_transformer.py:119
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/transformer_block.py:116
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/transformer_block.py:217
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:161
The behavior of expression A + B has been unified with elementwise_add(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_add(X, Y, axis=0) instead of A + B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  return (isinstance(seq, collections.Sequence) and
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:209
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:209
The behavior of expression A / B has been unified with elementwise_div(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_div(X, Y, axis=0) instead of A / B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:239
The behavior of expression A * B has been unified with elementwise_mul(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_mul(X, Y, axis=0) instead of A * B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py:298: UserWarning: /home/aistudio/Knover/models/generator.py:239
The behavior of expression A - B has been unified with elementwise_sub(X, Y, axis=-1) from Paddle 2.0. If your code works well in the older versions but crashes in this version, try to use elementwise_sub(X, Y, axis=0) instead of A - B. This transitional warning will be dropped in the future.
  op_type, op_type, EXPRESSION_MAP[method_name]))
W0412 19:20:59.318835 4704 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1
W0412 19:20:59.322726 4704 device_context.cc:372] device: 0, cuDNN Version: 7.6.
Load pretraining parameters from Plato.
Traceback (most recent call last):
  File "infer.py", line 139, in <module>
    infer(args)
  File "infer.py", line 86, in infer
    predictions = task.infer_step(model, data)
  File "/home/aistudio/Knover/tasks/task_base.py", line 43, in infer_step
    predictions = model.infer_step(inputs)
  File "/home/aistudio/Knover/models/plato.py", line 280, in infer_step
    return super(Plato, self).infer_step(inputs)
  File "/home/aistudio/Knover/models/unified_transformer.py", line 439, in infer_step
    predictions = self._run_generation(inputs)
  File "/home/aistudio/Knover/models/unified_transformer.py", line 394, in _run_generation
    return_numpy=False)
  File "/home/aistudio/Knover/models/model_base.py", line 266, in _execute
    fetch_vars = self.exe.run(program, feed, fetch_list, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1110, in run
    six.reraise(*sys.exc_info())
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1108, in run
    return_merged=return_merged)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1238, in _run_impl
    use_program_cache=use_program_cache)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/executor.py", line 1328, in _run_program
    [fetch_var_name])
ValueError: In user code:

    File "infer.py", line 139, in <module>
      infer(args)
    File "infer.py", line 72, in infer
      model = models.create_model(args, place)
    File "/home/aistudio/Knover/models/__init__.py", line 49, in create_model
      return MODEL_REGISTRY[args.model](args, place)
    File "/home/aistudio/Knover/models/plato.py", line 49, in __init__
      super(Plato, self).__init__(args, place)
    File "/home/aistudio/Knover/models/unified_transformer.py", line 93, in __init__
      super(UnifiedTransformer, self).__init__(args, place)
    File "/home/aistudio/Knover/models/model_base.py", line 74, in __init__
      self._build_programs()
    File "/home/aistudio/Knover/models/model_base.py", line 91, in _build_programs
      predictions = self.infer(inputs, outputs)
    File "/home/aistudio/Knover/models/unified_transformer.py", line 380, in infer
      return self.generator.inference(self, inputs, outputs)
    File "/home/aistudio/Knover/models/generator.py", line 175, in inference
      gather_idx=parent_idx)
    File "/home/aistudio/Knover/models/unified_transformer.py", line 178, in _generation_network
      gather_idx=gather_idx)
    File "/home/aistudio/Knover/models/unified_transformer.py", line 202, in _encode
      store=caches is not None
    File "/home/aistudio/Knover/models/transformer_block.py", line 376, in encoder
      store=store)
    File "/home/aistudio/Knover/models/transformer_block.py", line 288, in encoder_layer
      store=store)
    File "/home/aistudio/Knover/models/transformer_block.py", line 158, in multi_head_attention
      dropout_rate)
    File "/home/aistudio/Knover/models/transformer_block.py", line 116, in scaled_dot_product_attention
      product += attn_bias
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/math_op_patch.py", line 304, in __impl__
      attrs={'axis': axis})
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3023, in append_op
      attrs=kwargs.get("attrs", None))
    File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2107, in __init__
      for frame in traceback.extract_stack():

    InvalidArgumentError: Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [160, 12, 160, 427] and the shape of Y = [160, 12, 1, 268]. Received [427] in X is not equal to [268] in Y at i:3.
      [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:160)
      [operator < elementwise_add > error]
aistudio@jupyter-208728-1765888:~/Knover$
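The crash is a broadcast check failing in `product += attn_bias` (`transformer_block.py:116`): the attention logits have shape [160, 12, 160, 427] while the attention bias has shape [160, 12, 1, 268]. Elementwise broadcasting requires every dimension pair to be equal or 1, and the last axis (427 vs 268) satisfies neither. (As a side note, the leading 160 is consistent with batch_size 2 × num_samples 4 × latent_type_size 20 from the config above, though that reading is an inference, not something the log states.) A minimal sketch of the same rule, using NumPy as a stand-in for Paddle's `elementwise_add` and scaled-down leading dimensions (only the mismatched last axis matters):

```python
import numpy as np

# Shapes come straight from the error message; the leading dims are
# shrunk here so the example stays tiny -- only the last axis matters.
product = np.zeros((2, 2, 3, 427), dtype=np.float32)    # attention logits
attn_bias = np.zeros((2, 2, 1, 268), dtype=np.float32)  # attention bias/mask

try:
    product + attn_bias  # same broadcast check Paddle performs
except ValueError as e:
    print("broadcast failed:", e)  # 427 != 268 and neither is 1

# Once the trailing axes agree, the size-1 axis broadcasts as intended:
attn_bias_ok = np.zeros((2, 2, 1, 427), dtype=np.float32)
assert (product + attn_bias_ok).shape == (2, 2, 3, 427)
```

In other words, the bias fed into attention was built for a shorter sequence (268 positions) than the query/key tensors it is added to (427 positions); making the reader and model agree on sequence length removes the mismatch.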