
chinese-chatbot's Introduction

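Chinese-ChatBot / Chinese chatbot

  • The author has moved entirely to C++ development for GNNs (graph neural networks) and no longer follows NLP, so this project's code is no longer maintained. When the project was finished, online resources were scarce. The author came to NLP and deep learning for the first time on a whim, overcame many difficulties, and finally wrote this toy model. Knowing how hard things are for beginners, the author will still answer issues and email ([email protected]) promptly to help newcomers to deep learning, even though the project is unmaintained. (The TensorFlow version I used is very old, and newer versions will almost certainly raise errors if you run the code directly. If you get stuck, don't go out of your way to install the old environment; I suggest reimplementing my processing logic in PyTorch. I can't be bothered to write it myself.)
  • GNN work:
  • Adapted and organized a set of baseline comparison models, GNNs-Baseline, to make it quick to validate ideas.
  • The open-source code for my ACM MM 2023 (CCF-A) paper is here: LSTGM
  • The open-source code for my ICDM 2023 (CCF-B) paper is still being organized: GRN
  • Additions, discussion, and learning together are welcome.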

Environment

Program      Version
python       3.6.8
tensorflow   1.13.1
Keras        2.2.4
Windows      10
jupyter

Main references

Key points

  • LSTM
  • seq2seq
  • attention: experiments show that adding an attention mechanism makes training faster, convergence faster, and results better.
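
To make the attention bullet concrete, here is a minimal sketch of how attention weights and a context vector can be computed over encoder states. It is an illustration under stated assumptions, not the project's implementation: the names `softmax` and `dot_attention` are hypothetical, and dot-product scoring is chosen only for simplicity, so the notebooks may score alignments differently.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_attention(decoder_state, encoder_states):
    # Hypothetical helper: score each encoder state by its dot product
    # with the current decoder state.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context

# Toy example: three encoder states of dimension 2 (made-up numbers).
weights, context = dot_attention([1.0, 0.0],
                                 [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The weights sum to one, and the encoder states most similar to the decoder state receive the largest share; these are the quantities that an attention-weight visualization plots.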

Corpus and training environment

The Qingyun corpus, 100,000 dialogue pairs, trained on Google Colaboratory.

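Running

Option 1: full pipeline

  • Data preprocessing
    get_data
  • Model training
    chatbot_train (this version is set up to run mounted on Google Colab; to run it locally, paths and the like need minor changes)
  • Model prediction
    chatbot_inference_Attention

Option 2: load an existing model

  • Run chatbot_inference_Attention
  • Load models/W--184-0.5949-.h5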
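
The checkpoint file name appears to encode the training epoch. The training code quoted in the "Cannot load weight" issue reads it with `epoch_last.split('-')[2]`. The sketch below shows that parsing on the file name above. The helper name `parse_checkpoint_name` is hypothetical, and reading the fourth field as a loss value is an assumption.

```python
def parse_checkpoint_name(filename):
    # 'W--184-0.5949-.h5'.split('-') -> ['W', '', '184', '0.5949', '.h5']
    parts = filename.split('-')
    epoch = int(parts[2])     # same field the training code reads
    metric = float(parts[3])  # assumed to be the loss at that epoch
    return epoch, metric

epoch, metric = parse_checkpoint_name('W--184-0.5949-.h5')
# The quoted training code resumes from int(...) - 1.
initial_epoch = epoch - 1
```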

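Interface (Tkinter)

Attention weight visualization

Other

  • In the training file chat_bot, the first two of the last three code blocks mount Google Drive, and the last one collects the loss values for plotting. For some reason TensorBoard did not work in the callbacks, hence this workaround;
  • In the prediction file, the second-to-last code block takes text input only, with no interface, and the last block is the interface; run whichever of the two you need;
  • The code prints many intermediate outputs, which will hopefully help you understand it;
  • models contains a model I trained, which should run without problems; you can also train your own
  • The author's ability is limited and no metric for quantifying dialogue quality was found, so the loss only roughly reflects training progress.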

chinese-chatbot's People

Contributors

jayeew


chinese-chatbot's Issues

very good

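Thumbs up. Some authors leave code behind without debugging it and then ignore it, so you get a pile of errors and still can't hold a normal conversation. Love having a program this simple.

beam search results exceed the vocabulary index range

Hello, while running your trained model recently, I found that beam search returns results outside the vocabulary index range. Have you run into anything similar?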

I would also like to ask: if I train the model myself, roughly how many epochs are needed?

Cannot load weight

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-f17e176e67ed> in <module>
    103     epoch_list = get_file_list(main_path + 'models/')
    104     epoch_last = epoch_list[-1]
--> 105     model.load_weights(main_path + 'models/' + epoch_last)
    106     print("**********checkpoint_loaded: ", epoch_last)
    107     initial_epoch_ = int(epoch_last.split('-')[2]) - 1

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\keras\engine\training.py in load_weights(self, filepath, by_name, skip_mismatch)
    248         raise ValueError('Load weights is not yet supported with TPUStrategy '
    249                          'with steps_per_run greater than 1.')
--> 250     return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
    251 
    252   def compile(self,

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\keras\engine\network.py in load_weights(self, filepath, by_name, skip_mismatch)
   1264             f, self.layers, skip_mismatch=skip_mismatch)
   1265       else:
-> 1266         hdf5_format.load_weights_from_hdf5_group(f, self.layers)
   1267 
   1268   def _updated_config(self):

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py in load_weights_from_hdf5_group(f, layers)
    705                        str(len(weight_values)) + ' elements.')
    706     weight_value_tuples += zip(symbolic_weights, weight_values)
--> 707   K.batch_set_value(weight_value_tuples)
    708 
    709 

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\keras\backend.py in batch_set_value(tuples)
   3382   if ops.executing_eagerly_outside_functions():
   3383     for x, value in tuples:
-> 3384       x.assign(np.asarray(value, dtype=dtype(x)))
   3385   else:
   3386     with get_graph().as_default():

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py in assign(self, value, use_locking, name, read_value)
    844     with _handle_graph(self.handle):
    845       value_tensor = ops.convert_to_tensor(value, dtype=self.dtype)
--> 846       self._shape.assert_is_compatible_with(value_tensor.shape)
    847       assign_op = gen_resource_variable_ops.assign_variable_op(
    848           self.handle, value_tensor, name=name)

c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages\tensorflow\python\framework\tensor_shape.py in assert_is_compatible_with(self, other)
   1115     """
   1116     if not self.is_compatible_with(other):
-> 1117       raise ValueError("Shapes %s and %s are incompatible" % (self, other))
   1118 
   1119   def most_specific_compatible_shape(self, other):

ValueError: Shapes (43360, 100) and (43350, 100) are incompatible

After data processing the author adds one to every index. What is this for?

for i, j in word_to_index.items():
    word_to_index[i] = j + 1

index_to_word = {}
for key, value in word_to_index.items():
    index_to_word[value] = key
pad_question = question
pad_answer_a = answer_a
pad_answer_b = answer_b

for pos, i in enumerate(pad_question):
    for pos_, j in enumerate(i):
        i[pos_] = j + 1
    if(len(i) > answer_maxLen):
        pad_question[pos] = i[:answer_maxLen]
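
A common reason for shifting every index by one is to free index 0 for padding, so that a padded position can never be confused with a real word. Whether that is the author's reason is an assumption. The sketch below illustrates the idea with a hypothetical three-word vocabulary and a hypothetical `pad_to` helper.

```python
# Hypothetical toy vocabulary whose indices start at 0.
word_to_index = {'ni': 0, 'hao': 1, 'ma': 2}

# Shift every index by one so that 0 no longer maps to any word.
shifted = {word: idx + 1 for word, idx in word_to_index.items()}
index_to_word = {idx: word for word, idx in shifted.items()}

def pad_to(sequence, max_len, pad_index=0):
    # Truncate long sequences, then fill the remainder with the pad index.
    clipped = sequence[:max_len]
    return clipped + [pad_index] * (max_len - len(clipped))

sentence = [shifted[w] for w in ['ni', 'hao']]
padded = pad_to(sentence, 4)
```

With the shift, `padded` contains zeros only in the padded positions, and 0 is absent from `index_to_word`, so decoding can skip it safely.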

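Predicted sequences are all PAD

Hello, I reimplemented this following your code, but whatever sentence I predict on, the result is all PAD. How can this be solved?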

PAPER

Hello, I saw the comment # equation (6) of the paper
Which paper does it refer to?

An error occurs when loading the .h5 file; how can I fix it?

Error message
Traceback (most recent call last):
File "E:/360MoveData/Users/Administrator/Desktop/CHAT2/chatbot_inference_Attention.py", line 119, in
model.load_weights('models/W--184-0.5949-.h5')
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 43350. Shapes are [3,100] and [43350,100]. for 'Assign' (op: 'Assign') with input shapes: [3,100], [43350,100].

Cannot run

Traceback (most recent call last):
File "chatbot_inference_Attention.ipynb", line 372, in
"scrolled": true
NameError: name 'true' is not defined
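
This error typically arises when a `.ipynb` file is passed straight to the Python interpreter. A notebook is JSON, not Python source, so the interpreter reaches the JSON literal `true` and treats it as an undefined name. The notebook has to be opened in Jupyter or converted to a script first. Below is a minimal sketch of extracting the code cells with the standard library, assuming the nbformat 4 layout (a top-level `cells` list whose entries carry `cell_type` and `source`). The function name `notebook_to_source` and the tiny in-memory notebook are hypothetical.

```python
import json

def notebook_to_source(notebook_json):
    # Parse the notebook as JSON and keep only the code cells.
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get('cells', []):
        if cell.get('cell_type') != 'code':
            continue
        source = cell.get('source', [])
        # 'source' may be a list of lines or a single string.
        text = ''.join(source) if isinstance(source, list) else source
        chunks.append(text)
    return '\n\n'.join(chunks)

# Tiny stand-in notebook; a real one would be read from the .ipynb file.
demo = json.dumps({
    'cells': [
        {'cell_type': 'markdown', 'metadata': {}, 'source': ['# title']},
        {'cell_type': 'code', 'metadata': {'scrolled': True},
         'source': ['x = 1\n', 'y = x + 1']},
    ],
    'nbformat': 4,
})
script = notebook_to_source(demo)
```

In practice, `jupyter nbconvert --to script chatbot_inference_Attention.ipynb` produces a runnable `.py` file without any custom code.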

train by self

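Hello, I ran your code through successfully.
I want to train on a corpus I found myself, also 100,000 entries. The get_data preprocessing is done,
but training keeps raising an error. Could you tell me why?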

AlreadyExistsError: Resource __per_step_3/RMSprop/gradients/decoder_lstm/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var/N10tensorflow19TemporaryVariableOp6TmpVarE
[[{{node RMSprop/gradients/decoder_lstm/while/ReadVariableOp/Enter_grad/ArithmeticOptimizer/AddOpsRewrite_Add/tmp_var}}]]

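Incompatibility with newer library versions

I could get it working with a few fixes of my own, but I hope beginners can be given a little help. Also, chatbot_train.ipynb contains the misspelling modles.

Accidentally closed an issue

My versions are the same as yours, and then I ran
python chatbot_inference_Attention.ipynb
which shows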
Traceback (most recent call last):
File "chatbot_inference_Attention.ipynb", line 372, in
"scrolled": true
NameError: name 'true' is not defined

Project question

Hello, there is no main.py or START.bat file!!

package requirement

Hello, could you provide the packages and versions this package needs (requirements)?

gpu

Training on the GPU fills up the GPU memory. How can this be solved, and how do I change the batch size?
