Comments (27)
Recommending the fine-tuning toolkit developed by our team: XTuner
It already supports fine-tuning ChatGLM3-6B-Base, and the dataset-processing logic was designed with care so that extending it to custom data is easy.
One-command launch
ChatGLM3-6B-Base, QLoRA, open assistant dataset (roughly 11 GB of GPU memory):
pip install xtuner==0.1.6
xtuner train chatglm3_6b_base_qlora_oasst1_e3
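To point the same recipe at custom data, a minimal sketch (copy-cfg and train are xtuner subcommands; the copied file name and the data_path field are assumptions from memory, so verify against the XTuner docs):
# Copy the built-in config locally so it can be edited:
xtuner copy-cfg chatglm3_6b_base_qlora_oasst1_e3 .
# Edit the copied config (assumption: saved as *_copy.py) so its data_path
# points at your own dataset, then launch training from it:
xtuner train ./chatglm3_6b_base_qlora_oasst1_e3_copy.py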
LLaMA-Factory is all you need: https://github.com/hiyouga/LLaMA-Factory
@LZHgrla Thanks, I've finally got my QLoRA fine-tune running.
+1
+1, seconded.
+1
Does the ChatGLM2 fine-tuning code work here? I'm curious: these are models from the same series, so why can't the fine-tuning code be shared?
Could anyone explain how to organize multi-turn conversation data for training chatglm3? I couldn't figure out the expected layout from the ModelScope doc...
Does the ChatGLM2 fine-tuning code work here? I'm curious: these are models from the same series, so why can't the fine-tuning code be shared?
The input formats are different.
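For reference, a rough sketch of why the formats differ (templates reconstructed from memory, so verify against each repo's tokenizer code): ChatGLM2 builds its prompt as plain text, roughly
[Round 1]
问:{query}
答:{response}
while ChatGLM3 switched to dedicated role tokens, roughly <|system|> ... <|user|> ... <|assistant|> ..., so fine-tuning code that hard-codes one template will not build correct training samples for the other.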
https://github.com/xxw1995/chatglm3-finetune
Good stuff, bookmarking this.
@WangRongsheng does LLaMA-Factory support ChatGLM2-6B with QLoRA SFT in a few steps?
@LZHgrla how do I use xtuner on the command line to train on my custom dataset with QLoRA? Is there a guide or doc link?
@LZHgrla how do I use xtuner on the command line to train on my custom dataset with QLoRA? Is there a guide or doc link?
Single-turn conversation Docs: zh_cn, en
Multi-turn conversation Docs: zh_cn, en
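For the multi-turn question above, those docs describe one "conversation" list per training sample; a minimal sketch of the JSON layout (field names from memory, so double-check the linked multi-turn doc):
# Write a tiny multi-turn dataset; "system" is optional and, if present,
# goes on the first turn only:
cat > my_multiturn_data.json <<'EOF'
[
  {
    "conversation": [
      {"system": "You are a helpful assistant.",
       "input": "first user turn",
       "output": "first assistant reply"},
      {"input": "second user turn",
       "output": "second assistant reply"}
    ]
  }
]
EOF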
@WangRongsheng does LLaMA-Factory support ChatGLM2-6B with QLoRA SFT in a few steps?
Yes, it can.
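A minimal QLoRA SFT launch with the LLaMA-Factory of that era looked roughly like this (a sketch from memory against the 2023-style train_bash.py entry point; the dataset name and output path are placeholders, so check the repo's README for current flags):
# LoRA plus 4-bit quantization is how LLaMA-Factory expresses QLoRA:
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path THUDM/chatglm2-6b \
    --template chatglm2 \
    --dataset alpaca_gpt4_zh \
    --finetuning_type lora \
    --quantization_bit 4 \
    --output_dir ./chatglm2-6b-qlora-sft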
@LZHgrla Following the single-turn conversation doc, I hit this error: NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported.
Any ideas? @LZHgrla
leo@leo-System-Product-Name:~/Downloads/mvp/work_dirs$ xtuner -v
10/29 20:58:18 - mmengine - INFO - 0.1.6
@LZHgrla Following the single-turn conversation doc, I hit this error: NotImplementedError: Loading a dataset cached in a LocalFileSystem is not supported. Any ideas? (xtuner 0.1.6)
You can try pip install -U datasets.
If you have further questions, please post them here.
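If upgrading datasets is not possible, the error can also be worked around by pinning fsspec back (the failure comes from an fsspec 2023.10 change that older datasets releases do not handle; the version cutoff below is from memory):
pip install "fsspec<=2023.9.2"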
marked
marked
Recommending the fine-tuning toolkit developed by our team: XTuner. It already supports fine-tuning ChatGLM3-6B-Base, and the dataset-processing logic was designed with care so that extending it to custom data is easy.
One-command launch
ChatGLM3-6B-Base, QLoRA, open assistant dataset (roughly 11 GB of GPU memory):
pip install xtuner==0.1.6
xtuner train chatglm3_6b_base_qlora_oasst1_e3
After fine-tuning chatglm3 with xtuner train, no adapter_config.json is generated, so the QLoRA-trained weights cannot be used. @LZHgrla
Also interested in this question, thanks.
+1
https://github.com/minghaochen/chatglm3-base-tuning
chatglm3 is out, and this time a base version of the model was released as well, which means we can freely run SFT on the base model. This project implements multi-turn conversation SFT on the base model.
Also interested in this question, thanks.
+1
Recommending the fine-tuning toolkit developed by our team: XTuner. It already supports fine-tuning ChatGLM3-6B-Base, and the dataset-processing logic was designed with care so that extending it to custom data is easy.
One-command launch
ChatGLM3-6B-Base, QLoRA, open assistant dataset (roughly 11 GB of GPU memory):
pip install xtuner==0.1.6
xtuner train chatglm3_6b_base_qlora_oasst1_e3
After fine-tuning chatglm3 with xtuner train, no adapter_config.json is generated, so the QLoRA-trained weights cannot be used. @LZHgrla
We cannot reproduce this problem in our tests; after training, running the conversion step yields the QLoRA adapter weights directly.
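The conversion referred to here is xtuner's pth-to-HuggingFace step, which is what writes adapter_config.json; a sketch (the checkpoint filename is an illustrative assumption):
# Convert the training checkpoint into a HuggingFace-format PEFT adapter:
xtuner convert pth_to_hf chatglm3_6b_base_qlora_oasst1_e3 \
    ./work_dirs/chatglm3_6b_base_qlora_oasst1_e3/epoch_3.pth \
    ./hf_adapter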
When will the fine-tuning code be released?
The fine-tuning code has been released; please see the ChatGLM3-6B fine-tuning examples.