Comments (3)
These are the special tokens we used during training, and the template looks like this, so conversations need to follow this template.
The encoding produced by the chat path does line up.
In the glm-4 repository we provide a version aligned with apply_chat_template.
from chatglm3.
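To make the template above concrete, here is a minimal sketch (an illustrative helper of my own, not code from the repo) of how role-tagged messages could be joined with these special tokens before tokenization:

```python
# Illustrative sketch only: joins role-tagged messages with the
# <|user|> / <|assistant|> special tokens discussed in this thread.
def build_prompt(messages):
    """messages: list of (role, content) pairs, role in {'user', 'assistant'}."""
    parts = [f"<|{role}|>\n{content}" for role, content in messages]
    # A trailing <|assistant|> tag prompts the model to produce the next reply.
    parts.append("<|assistant|>")
    return "\n".join(parts)

print(build_prompt([("user", "tell a story")]))
# -> <|user|>\ntell a story\n<|assistant|>
```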
Thanks for your reply. My understanding is that `<|user|>\ntell a story\n<|assistant|>`
is the training template of the base model chatglm3-6b-base: the raw dialogue data is concatenated with the special tokens (<|user|>, <|assistant|>)
and then tokenized in a single pass for training, while the current chat interface tokenizes each utterance of a multi-turn dialogue separately and then concatenates the special token ids at the id level, and that processing logic exists purely for the convenience of multi-turn interaction?
What actually puzzles me is that concatenating the multi-turn dialogue into `<|user|>\ntell a story\n<|assistant|>`,
tokenizing it in one pass, and feeding it to the generate interface seems to give results that don't quite match what the chat interface produces by tokenizing each piece separately and then concatenating the ids. I'm not sure whether my approach is reasonable. Sorry for the trouble.
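Such a mismatch is plausible with subword tokenizers: tokenizing the fully concatenated string in one pass can merge pieces across what was a message boundary, while tokenizing each utterance separately and concatenating the ids preserves that boundary. A toy illustration with a made-up vocabulary (not the actual ChatGLM3 tokenizer):

```python
# Toy vocabulary: contains "ab" as a single token as well as "a" and "b".
vocab = {"a": 1, "b": 2, "ab": 3}

def tokenize_greedy(text):
    """Greedy longest-match tokenization over the toy vocab."""
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest piece first
            if text[i:j] in vocab:
                ids.append(vocab[text[i:j]])
                i = j
                break
        else:
            raise ValueError(f"unknown character {text[i]!r}")
    return ids

# One pass over the concatenation merges across the boundary:
print(tokenize_greedy("ab"))                        # [3]
# Per-segment tokenization plus id concatenation keeps the boundary:
print(tokenize_greedy("a") + tokenize_greedy("b"))  # [1, 2]
```

The chat interface's per-utterance tokenization plus id-level concatenation guarantees the special-token ids appear exactly as isolated ids, which may be why the two paths disagree at segment boundaries.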
Also, I noticed the newly released THUDM/glm-4-9b-chat. Is it an iteration of ChatGLM3, or does the different name mean it is a model with the same functionality but a very different underlying approach?
I also saw the apply_chat_template method in glm-4-9b-chat and will study it carefully first. Thanks for your answers and guidance. Wishing you a happy mood and lots of smiles. Finger hearts!
from chatglm3.
It is an iteration of GLM3, and the technical approach is the same.
As for the template you mentioned:
a pretrained base model has no notion of a template; templates only exist for chat models.
That is, when fine-tuning, you are only required to strictly follow the template if you are fine-tuning the chat model.
from chatglm3.
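A chat template like the one discussed here is conceptually just a renderer from role-tagged messages to the special-token format. This mock (illustrative only, not the actual apply_chat_template implementation from the glm-4 repo) shows the shape of the transformation:

```python
# Mock chat-template renderer; the format is based on the
# <|user|>/<|assistant|> template discussed in this thread, not on repo code.
def render_chat_template(messages, add_generation_prompt=True):
    rendered = "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)
    if add_generation_prompt:
        # Open an assistant turn so generation continues from here.
        rendered += "<|assistant|>"
    return rendered

print(render_chat_template([{"role": "user", "content": "tell a story"}]))
# -> <|user|>\ntell a story\n<|assistant|>
```

In transformers, the real method lives on the tokenizer (`tokenizer.apply_chat_template(messages, add_generation_prompt=True)`) and can return token ids directly, which reproduces the training-time template exactly when fine-tuning or prompting a chat model.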
Related Issues (20)
- Where is the source code for the page demo in composite_demo? Could you share it?
- Failed to output model attention
- Single machine with two GPUs errors with RuntimeError: Expected all tensors to be on the same device, but found at least two devices
- ImportError: cannot import name 'split_torch_state_dict_into_shards' from 'huggingface_hub'
- AttributeError: 'ChatGLMForConditionalGeneration' object has no attribute 'save_checkpoint'
- chatglm3-6b-32k runs out of memory when processing long text
- RuntimeError: Internal: could not parse ModelProto from /Users/wangjun/llm/model/chatglm3-6b/tokenizer.model
- chatglm3-6b streaming responses include an extra newline at the start of the first line
- AttributeError: 'GenerationConfig' object has no attribute '_eos_token_tensor'
- chatglm3-6b\modeling_chatglm.py", line 413, in forward, cache_k, cache_v = kv_cache, ValueError: too many values to unpack (expected 2)
- Intel demo: AutoModelForCausalLM model.generate gives wrong responses when running the same chatglm3-int4 model bin file via docker
- Model fails to load
- requirements.txt conflict issue
- Starting web_demo_gradio.py for a conversation raises ValueError: too many values to unpack (expected 2)
- chatglm3-6b-base deployed via the API gives abnormal responses
- ModuleNotFoundError: No module named 'huggingface_hub.inference._text_generation'
- The following error occurs during the fine-tuning of ChatGLM3-6B: ImportError: cannot import name 'log' from 'torch.distributed.elastic.agent.server.api'
- The figure in the execute function evaluates to None
- In tools_using_demo's openai_api_demo.py, finish_reason never shows "function_call" when stream is True
- Fine-tuning with ptuning_v2 raises ValueError: Hypothesis is empty.