
self-llm's Introduction

A Hands-On Guide to Using Open-Source LLMs

This project is a beginner-friendly large-model tutorial built around open-source LLMs, aimed at newcomers in China and based on the AutoDL platform. It provides end-to-end guidance for a wide range of open-source LLMs, covering environment setup, local deployment, and efficient fine-tuning, with the goal of simplifying how open-source LLMs are deployed, used, and applied, so that ordinary students and researchers can make better use of them and open, freely available LLMs can become part of everyday learning sooner.

The main contents of this project include:

  1. Environment-setup guides for open-source LLMs based on the AutoDL platform (extensible to other providers such as Alibaba Cloud), with detailed, model-specific configuration steps;
  2. Deployment and usage tutorials for mainstream open-source LLMs from China and abroad, including LLaMA, ChatGLM, InternLM, and more (a minimal usage sketch follows this list);
  3. Guidance on deploying and applying open-source LLMs, including command-line invocation, online demo deployment, and LangChain integration;
  4. Full-parameter and parameter-efficient fine-tuning methods for open-source LLMs, including distributed full fine-tuning, LoRA, and P-Tuning.
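
As a first taste of what the deployment tutorials build up to, here is a minimal, hedged usage sketch with the Hugging Face transformers pipeline. The model name is only a placeholder (on AutoDL you would normally point it at a locally downloaded path), and the tutorials themselves go into much more detail:

from transformers import pipeline

# Placeholder model; replace with any supported model or a local download path.
pipe = pipeline("text-generation", model="Qwen/Qwen1.5-1.8B-Chat", device_map="auto")

# Generate a short reply from a plain prompt.
print(pipe("你好,请介绍一下你自己。", max_new_tokens=128)[0]["generated_text"])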

The core of this project is the tutorials themselves, helping more students and future practitioners get to know and get comfortable with using open-source LLMs! Anyone is welcome to open an issue or submit a PR and help build and maintain this project together.

If you would like to get deeply involved, contact us and we will add you as a project maintainer.

Suggested learning path: start with environment setup, then move on to deploying and using the models, and finally study fine-tuning. Environment setup and deployment are the foundations; fine-tuning is the more advanced step. Beginners may want to start with models such as Qwen1.5, InternLM2, or MiniCPM.

Note: if you want to understand how large models are built internally, or to hand-write RAG, Agent, and Eval pipelines from scratch, have a look at Datawhale's other project, Tiny-Universe. Large models are a hot topic in deep learning today, but most existing tutorials only teach how to call an API to build applications, and few explain the model architecture, RAG, Agents, and Eval from first principles. That repository therefore implements the RAG, Agent, and Eval tasks entirely by hand, without relying on API calls.

Note: if you would like to study the theory of large models before (or alongside) this project, and build a deeper foundation for understanding and applying LLMs, see Datawhale's so-large-llm course.

Note: if, after finishing this course, you want to build your own LLM applications, see Datawhale's 动手学大模型应用开发 (Hands-On LLM Application Development) course. It is an application-development tutorial for beginner developers that walks through the complete LLM application workflow on an Alibaba Cloud server, using a personal knowledge-base assistant as the running example.

Project Significance

What is a large language model?

A large language model (LLM), in the narrow sense, is a natural language processing (NLP) model trained with deep learning, used mainly for natural language understanding and generation; in the broader sense, the term also covers large computer vision (CV) models, multimodal models, and scientific-computing models.

The "battle of a hundred models" is in full swing, and open-source LLMs keep appearing. Many excellent open-source LLMs have emerged at home and abroad: LLaMA and Alpaca abroad, and ChatGLM, BaiChuan, and InternLM (书生·浦语) in China, among others. Open-source LLMs can be deployed locally and fine-tuned on private data, so anyone can build their own custom large model on top of them.

However, ordinary students and users currently need a certain amount of technical skill to deploy and use these models, and with so many open-source LLMs appearing, each with its own quirks, quickly getting up to speed on any one of them is a real challenge.

This project first aims to provide deployment, usage, and fine-tuning tutorials for mainstream open-source LLMs from China and abroad, based on the experience of the core contributors. Once the mainstream models are covered, we hope to gather more co-creators to enrich this open-source LLM world together and build tutorials for more, and more specialized, models. Every spark counts; together they become a sea of stars.

We hope to be the staircase between LLMs and the general public, embracing the wider and grander world of LLMs with the free and egalitarian spirit of open source.

Target Audience

This project is suitable for learners who:

  • want to use or try out LLMs but have no means to obtain or use the relevant APIs;
  • want to apply LLMs over the long term, at low cost, and at scale;
  • are interested in open-source LLMs and want to get hands-on with them;
  • are studying NLP and want to go further into LLMs;
  • want to build domain-specific, private LLMs on top of open-source models;
  • and the broadest group of all: ordinary students.

Project Plan and Progress

This project is organized around the full workflow of applying open-source LLMs, including environment setup and usage, deployment and application, and fine-tuning, with each part covering both mainstream and distinctive open-source LLMs:

Supported Models

Image List

General Environment Setup

Acknowledgements

Core Contributors

Others

  • Special thanks to @Sm1les for the help and support with this project
  • Some of the LoRA code and explanations are adapted from: https://github.com/zyds/transformers-code.git
  • If you have any ideas, feel free to contact us at DataWhale; we also welcome lots of issues
  • Special thanks to everyone below who contributed to the tutorials!

Star History

self-llm's People

Contributors

acwwt, aphasia0515, baiyu96, byx020119, cswangxiaowei, dingyue772, heitao5200, hongru0306, iehaoang, jjyaoao, joe-2002, kailigithub, kmno4-zx, l4heyxiao, logan-zou, mikawudi, mlw67, sanbuphy, thomas-yanxin, unicornlyy, weihong-liu


self-llm's Issues

Building wheel for flash-attn (setup.py) ... - stuck at this step

root@autodl-container-ba6544adff-463e93c8:~# MAX_JOBS=8 pip install flash-attn --no-build-isolation
Looking in indexes: http://mirrors.aliyun.com/pypi/simple
Collecting flash-attn
Using cached http://mirrors.aliyun.com/pypi/packages/72/94/06f618bb338ec7203b48ac542e73087362b7750f9c568b13d213a3f181bb/flash_attn-2.5.8.tar.gz (2.5 MB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in ./miniconda3/lib/python3.10/site-packages (from flash-attn) (2.1.2+cu121)
Requirement already satisfied: einops in ./miniconda3/lib/python3.10/site-packages (from flash-attn) (0.8.0)
Requirement already satisfied: packaging in ./miniconda3/lib/python3.10/site-packages (from flash-attn) (23.2)
Collecting ninja (from flash-attn)
Using cached http://mirrors.aliyun.com/pypi/packages/6d/92/8d7aebd4430ab5ff65df2bfee6d5745f95c004284db2d8ca76dcbfd9de47/ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
Requirement already satisfied: filelock in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (3.13.1)
Requirement already satisfied: typing-extensions in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (4.9.0)
Requirement already satisfied: sympy in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (1.12)
Requirement already satisfied: networkx in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (3.2.1)
Requirement already satisfied: jinja2 in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (3.1.2)
Requirement already satisfied: fsspec in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (2023.12.2)
Requirement already satisfied: triton==2.1.0 in ./miniconda3/lib/python3.10/site-packages (from torch->flash-attn) (2.1.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./miniconda3/lib/python3.10/site-packages (from jinja2->torch->flash-attn) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in ./miniconda3/lib/python3.10/site-packages (from sympy->torch->flash-attn) (1.3.0)
Building wheels for collected packages: flash-attn
Building wheel for flash-attn (setup.py) ... -
It hangs at this point with no further output. Any help would be appreciated.

Fine-tuning deepseek-coder-7b-v1.5 with the DeepSeek-7B-chat LoRA fine-tuning script as reference: the generated output is all exclamation marks

I fine-tuned deepseek-coder-7b-v1.5 following the DeepSeek-7B-chat LoRA fine-tuning script,
with the prompt changed to the deepseek-coder-7b-v1.5 default.

Training parameters:

batch size:2
epoch:2
gradient_accumulation_steps=2,
save_steps=20,
learning_rate=1e-4

LoRA parameters:

target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
inference_mode=False,
r=8,
lora_alpha=32,
lora_dropout=0.1

The loss drops to 0 by the second step, and when the checkpoint saved at step 20 is used for inference, the output consists entirely of exclamation marks.

Inference code:

(screenshot of the inference code)
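
For reference, the training and LoRA parameters listed above would roughly assemble into the following peft/transformers configuration. This is only a sketch of the reported setup (the output directory is a placeholder), not a verified reproduction of the run:

from peft import LoraConfig, TaskType
from transformers import TrainingArguments

# LoRA parameters as listed in this issue
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

# Training parameters as listed in this issue
training_args = TrainingArguments(
    output_dir="./output/deepseek-coder-lora",  # placeholder
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    save_steps=20,
    learning_rate=1e-4,
)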

chatGLM3 fails when saving the model, with TypeError: Object of type set is not JSON serializable

@Hongru0306 @KMnO4-zx Hi, I am still learning how to fine-tune and use large models. This problem has been bothering me for a long time. I found that #37 and #47 describe the same thing, and neither contains a workable answer. The issue in #47 is titled "error during chatGLM3 fine-tuning", but I believe it is really the model save that fails: you can see it stops at step 101 because the save step is set to 100, so the error is raised once training reaches step 101. Here are my args:

args = TrainingArguments(
    output_dir="/root/autodl-tmp/self-llm/ChatGLM/output",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    logging_steps=20,
    num_train_epochs=1,
    gradient_checkpointing=True,
    save_steps=15,
)

If gradient_checkpointing=True is not used, saving at the end with trainer.save_model() or model.save_pretrained("my_finetuned_model") raises the same TypeError: Object of type set is not JSON serializable. The suggestion in #37 was to upgrade transformers to 4.37.2; my version is already 4.37.2, and upgrading to 4.38.2 did not help either.
Along the way, with gradient_checkpointing=True set in args, I hit RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn, which I worked around by calling model.enable_input_require_grads() before get_peft_model — I am not sure whether that is related. I have already spent a lot of time on this error and have tried the code in both the md and the py files, which are nearly identical. I hope a maintainer can clear up my confusion.
Below is the error message; it should be the same as in #47:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[22], line 1
----> 1 trainer.train()

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:1624, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1622         hf_hub_utils.enable_progress_bars()
   1623 else:
-> 1624     return inner_training_loop(
   1625         args=args,
   1626         resume_from_checkpoint=resume_from_checkpoint,
   1627         trial=trial,
   1628         ignore_keys_for_eval=ignore_keys_for_eval,
   1629     )

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:2029, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   2026     self.state.epoch = epoch + (step + 1 + steps_skipped) / steps_in_epoch
   2027     self.control = self.callback_handler.on_step_end(args, self.state, self.control)
-> 2029     self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
   2030 else:
   2031     self.control = self.callback_handler.on_substep_end(args, self.state, self.control)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:2423, in Trainer._maybe_log_save_evaluate(self, tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
   2420         self.lr_scheduler.step(metrics[metric_to_check])
   2422 if self.control.should_save:
-> 2423     self._save_checkpoint(model, trial, metrics=metrics)
   2424     self.control = self.callback_handler.on_save(self.args, self.state, self.control)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:2499, in Trainer._save_checkpoint(self, model, trial, metrics)
   2497 else:
   2498     staging_output_dir = os.path.join(run_dir, f"tmp-{checkpoint_folder}")
-> 2499 self.save_model(staging_output_dir, _internal_call=True)
   2501 if not self.args.save_only_model:
   2502     # Save optimizer and scheduler
   2503     self._save_optimizer_and_scheduler(staging_output_dir)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:3016, in Trainer.save_model(self, output_dir, _internal_call)
   3013         self.model_wrapped.save_checkpoint(output_dir)
   3015 elif self.args.should_save:
-> 3016     self._save(output_dir)
   3018 # Push to the Hub when `save_model` is called by the user.
   3019 if self.args.push_to_hub and not _internal_call:

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/transformers/trainer.py:3089, in Trainer._save(self, output_dir, state_dict)
   3087             torch.save(state_dict, os.path.join(output_dir, WEIGHTS_NAME))
   3088 else:
-> 3089     self.model.save_pretrained(
   3090         output_dir, state_dict=state_dict, safe_serialization=self.args.save_safetensors
   3091     )
   3093 if self.tokenizer is not None:
   3094     self.tokenizer.save_pretrained(output_dir)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/peft/peft_model.py:201, in PeftModel.save_pretrained(self, save_directory, safe_serialization, selected_adapters, **kwargs)
    198 else:
    199     auto_mapping_dict = None
--> 201 peft_config.save_pretrained(output_dir, auto_mapping_dict=auto_mapping_dict)
    202 peft_config.inference_mode = inference_mode

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/site-packages/peft/utils/config.py:92, in PeftConfigMixin.save_pretrained(self, save_directory, **kwargs)
     90 # save it
     91 with open(output_path, "w") as writer:
---> 92     writer.write(json.dumps(output_dict, indent=2, sort_keys=True))

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/encoder.py:201, in JSONEncoder.encode(self, o)
    199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
--> 201     chunks = list(chunks)
    202 return ''.join(chunks)

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/encoder.py:431, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
    429     yield from _iterencode_list(o, _current_indent_level)
    430 elif isinstance(o, dict):
--> 431     yield from _iterencode_dict(o, _current_indent_level)
    432 else:
    433     if markers is not None:

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/encoder.py:405, in _make_iterencode.<locals>._iterencode_dict(dct, _current_indent_level)
    403         else:
    404             chunks = _iterencode(value, _current_indent_level)
--> 405         yield from chunks
    406 if newline_indent is not None:
    407     _current_indent_level -= 1

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/encoder.py:438, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
    436         raise ValueError("Circular reference detected")
    437     markers[markerid] = o
--> 438 o = _default(o)
    439 yield from _iterencode(o, _current_indent_level)
    440 if markers is not None:

File ~/miniconda3/envs/ma_code_interpreter/lib/python3.10/json/encoder.py:179, in JSONEncoder.default(self, o)
    160 def default(self, o):
    161     """Implement this method in a subclass such that it returns
    162     a serializable object for ``o``, or calls the base implementation
    163     (to raise a ``TypeError``).
   (...)
    177 
    178     """
--> 179     raise TypeError(f'Object of type {o.__class__.__name__} '
    180                     f'is not JSON serializable')

TypeError: Object of type set is not JSON serializable
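
One possible mitigation (an untested sketch, not an official fix from this repository; upgrading peft/transformers as suggested in the linked issues may be the proper solution) is to make sure the PEFT config holds no Python set before any checkpoint is saved, since json.dumps() cannot serialize sets:

from dataclasses import fields

# Hypothetical mitigation: cast any set-valued fields of the LoRA config to lists
# right after get_peft_model(), so the adapter_config.json can be serialized.
for peft_cfg in model.peft_config.values():
    for f in fields(peft_cfg):
        value = getattr(peft_cfg, f.name)
        if isinstance(value, set):
            setattr(peft_cfg, f.name, list(value))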

This helped a lot

One more question: is there a stream_chat-style method similar to what GPT and chatglm provide?

Error when running 04-ChatGLM3-6B-Code-Interpreter.md

While working through ChatGLM's 04-ChatGLM3-6B-Code-Interpreter.md,

streamlit run /root/autodl-tmp/ChatGLM3/composite_demo/main.py --server.address 127.0.0.1 --server.port 6006

after running this line and opening the front end at 127.0.0.1:6006, the error in the screenshot below appears.
(screenshot)

The error is asyncio.run() cannot be called from a running event loop — could someone please take a look?

Error: asyncio.run() cannot be called from a running event loop
This is set up on AutoDL.


The log is as follows:

RuntimeError Traceback (most recent call last)
Cell In[2], line 89
85 model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto", torch_dtype=torch.bfloat16).cuda()
87 # 启动FastAPI应用
88 # 用6006端口可以将autodl的端口映射到本地,从而在本地使用api
---> 89 uvicorn.run(app, host='0.0.0.0', port=6006, workers=1) # 在指定端口和主机上启动应用

File ~/miniconda3/lib/python3.10/site-packages/uvicorn/main.py:575, in run(app, host, port, uds, fd, loop, http, ws, ws_max_size, ws_max_queue, ws_ping_interval, ws_ping_timeout, ws_per_message_deflate, lifespan, interface, reload, reload_dirs, reload_includes, reload_excludes, reload_delay, workers, env_file, log_config, log_level, access_log, proxy_headers, server_header, date_header, forwarded_allow_ips, root_path, limit_concurrency, backlog, limit_max_requests, timeout_keep_alive, timeout_graceful_shutdown, ssl_keyfile, ssl_certfile, ssl_keyfile_password, ssl_version, ssl_cert_reqs, ssl_ca_certs, ssl_ciphers, headers, use_colors, app_dir, factory, h11_max_incomplete_event_size)
573 Multiprocess(config, target=server.run, sockets=[sock]).run()
574 else:
--> 575 server.run()
576 if config.uds and os.path.exists(config.uds):
577 os.remove(config.uds) # pragma: py-win32

File ~/miniconda3/lib/python3.10/site-packages/uvicorn/server.py:65, in Server.run(self, sockets)
63 def run(self, sockets: list[socket.socket] | None = None) -> None:
64 self.config.setup_event_loop()
---> 65 return asyncio.run(self.serve(sockets=sockets))

File ~/miniconda3/lib/python3.10/asyncio/runners.py:33, in run(main, debug)
9 """Execute the coroutine and return the result.
10
11 This function runs the passed coroutine, taking care of
(...)
30 asyncio.run(main())
31 """
32 if events._get_running_loop() is not None:
---> 33 raise RuntimeError(
34 "asyncio.run() cannot be called from a running event loop")
36 if not coroutines.iscoroutine(main):
37 raise ValueError("a coroutine was expected, got {!r}".format(main))

RuntimeError: asyncio.run() cannot be called from a running event loop
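
This error typically means uvicorn.run() is being called inside an environment that already has a running event loop (for example a Jupyter notebook cell). One common workaround, shown here only as a hedged sketch, is to patch the loop with nest_asyncio; running the script from a plain terminal with python instead of a notebook avoids the problem entirely:

# Only needed when launching uvicorn from a notebook / already-running event loop.
# pip install nest_asyncio
import nest_asyncio
import uvicorn

nest_asyncio.apply()  # allow asyncio.run() to be called inside the existing loop
uvicorn.run(app, host="0.0.0.0", port=6006, workers=1)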

How can I obtain the fine-tuning data format for the Yi model?


<|im_start|>system
现在你要扮演皇帝身边的女人--甄嬛.<|im_end|>
<|im_start|>user
小姐,别的秀女都在求中选,唯有咱们小姐想被撂牌子,菩萨一定记得真真儿的——<|im_end|>
<|im_start|>assistant
嘘——都说许愿说破是不灵的。<|im_end|>
<|endoftext|>
For a format like the one above, it would be helpful to explain in detail how to obtain the data format expected by different models. Thanks for contributing this project!
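
As an illustration only (the exact template should always follow the model card of the model being fine-tuned), the sample quoted above can be assembled from a system/user/assistant record with a few string operations; a hedged sketch:

def build_chatml_sample(system, user, assistant):
    # Mirrors the <|im_start|> / <|im_end|> layout quoted above.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>\n"
        "<|endoftext|>"
    )

sample = build_chatml_sample(
    "现在你要扮演皇帝身边的女人--甄嬛.",
    "小姐,别的秀女都在求中选,唯有咱们小姐想被撂牌子,菩萨一定记得真真儿的——",
    "嘘——都说许愿说破是不灵的。",
)
print(sample)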

Why is the data format for LoRA fine-tuning in this project different from the official Chatglm3 fine-tuning demo?

The LoRA fine-tuning data format in this project:

{
    "instruction": "",
    "input":"你是谁?",
    "output":"家父是大理寺少卿甄远道。"
}

The official LoRA fine-tuning data format:

[
  {
    "conversations": [
      {
        "role": "system",
        "content": "<system prompt text>"
      },
      {
        "role": "user",
        "content": "<user prompt text>"
      },
      {
        "role": "assistant",
        "content": "<assistant response text>"
      },
      // ... Muti Turn
      {
        "role": "user",
        "content": "<user prompt text>"
      },
      {
        "role": "assistant",
        "content": "<assistant response text>"
      }
    ]
  }
  // ...
]
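
The two layouts carry the same information and can be converted mechanically. A hedged sketch that turns this repository's instruction/input/output records into the official conversations layout (field names taken from the two examples above):

def to_official_format(record):
    """Convert an {instruction, input, output} record into the official conversations layout."""
    conversations = []
    if record.get("instruction"):
        conversations.append({"role": "system", "content": record["instruction"]})
    conversations.append({"role": "user", "content": record["input"]})
    conversations.append({"role": "assistant", "content": record["output"]})
    return {"conversations": conversations}

print(to_official_format({
    "instruction": "",
    "input": "你是谁?",
    "output": "家父是大理寺少卿甄远道。",
}))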

qwen-vl

Is there any fine-tuning code for qwen-vl? Are you planning to reproduce it any time soon?

After training the chatGLM model, the fine-tuning output differs from the example

(screenshot)
In the example, the fine-tuned weights end up in the checkpoint-1000 folder under the model output directory. After running the fine-tuning exactly as in the example, my output path is different — runs/Jan27_01-06-17_autodl-container-049a448514-394ad272/ — and the files inside are different as well.
(screenshots)
How should this be handled?

chatglm3-6b FastAPI invocation

Using https://huggingface.co/THUDM/chatglm3-6b from Hugging Face
and following the chatglm3-6b FastAPI code, the call returns the following:

Prompt input:
<|system|>
You are an artificial intelligence assistant.
<|user|>
你好,请给我讲一个故事
<|assistant|>

history=[]

The response output is:
"1一个人,请注意,请注意,表达式微带状如题当前时间是相关的函数式微带状如题目前所经超时而在��"

What could be the possible causes?

LoRA fine-tuning problems with ChatGLM3

1. The loss drops too quickly, but I have not found the cause, and no new model files are generated when training finishes.
(screenshot of the abnormal loss)
2. Everything before that runs fine, but model inference throws an error, which I suspect is also because the model files were never actually generated.

ChatGLM3-6B goes mute after fine-tuning (literally)

Today I fine-tuned ChatGLM3-6B following https://github.com/datawhalechina/self-llm/blob/master/ChatGLM/06-ChatGLM3-6B-Lora%E5%BE%AE%E8%B0%83.md. During fine-tuning the loss suddenly dropped to 0, and afterwards the fine-tuned model would not say a word when asked questions — it returned the input unchanged without generating a single token…

(screenshot)

Python 3.10.8
Versions of the relevant libraries:

Package                        Version
------------------------------ ---------------
accelerate                     0.28.0
datasets                       2.18.0
peft                           0.9.0
torch                          2.1.2+cu121
transformers                   4.38.2

Failed to import transformers.models.qwen2

RuntimeError: Failed to import transformers.models.qwen2.modeling_qwen2 because of the following error (look up to see its traceback): /root/anaconda3/envs/qwen1.5/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops9_pad_enum4callERKNS_6TensorEN3c108ArrayRefINS5_6SymIntEEElNS5_8optionalIdEE

Qwen/04-Qwen-7B-Chat LoRA fine-tuning error: TypeError: _set_gradient_checkpointing() got an unexpected keyword argument 'enable'

I installed the third-party library versions given in the guide,
but running the corresponding script raises this error:

The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.56s/it]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /data2/py/demo/fin_demo.py:76 in │
│ │
│ 73 │ │ train_dataset=tokenized_id, │
│ 74 │ │ data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, padding=True), │
│ 75 │ │ ) │
│ ❱ 76 │ trainer.train() # 开始训练 │
│ 77 │ response, history = model.chat(tokenizer, "你是谁", history=[], system="现在你要扮演 │
│ 78 │ print(response) │
│ 79 │
│ │
│ /data2/py/lib/python3.9/site-packages/transformers/trainer.py:1555 in train │
│ │
│ 1552 │ │ │ finally: │
│ 1553 │ │ │ │ hf_hub_utils.enable_progress_bars() │
│ 1554 │ │ else: │
│ ❱ 1555 │ │ │ return inner_training_loop( │
│ 1556 │ │ │ │ args=args, │
│ 1557 │ │ │ │ resume_from_checkpoint=resume_from_checkpoint, │
│ 1558 │ │ │ │ trial=trial, │
│ │
│ /data2/py/lib/python3.9/site-packages/transformers/trainer.py:1668 in _inner_training_loop │
│ │
│ 1665 │ │ │ else: │
│ 1666 │ │ │ │ gradient_checkpointing_kwargs = args.gradient_checkpointing_kwargs │
│ 1667 │ │ │ │
│ ❱ 1668 │ │ │ self.model.gradient_checkpointing_enable(gradient_checkpointing_kwargs=gradi │
│ 1669 │ │ │
│ 1670 │ │ model = self._wrap_model(self.model_wrapped) │
│ 1671 │
│ │
│ /data2/py/lib/python3.9/site-packages/transformers/modeling_utils.py:1872 in │
│ gradient_checkpointing_enable │
│ │
│ 1869 │ │ │
│ 1870 │ │ gradient_checkpointing_func = functools.partial(checkpoint, **gradient_checkpoin │
│ 1871 │ │ │
│ ❱ 1872 │ │ self._set_gradient_checkpointing(enable=True, gradient_checkpointing_func=gradie │
│ 1873 │ │ │
│ 1874 │ │ if getattr(self, "_hf_peft_config_loaded", False): │
│ 1875 │ │ │ # When using PEFT + gradient checkpointing + Trainer we need to make sure th │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: _set_gradient_checkpointing() got an unexpected keyword argument 'enable'
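
For context, in this transformers version the Trainer only calls gradient_checkpointing_enable() (and hence the failing _set_gradient_checkpointing(enable=...)) when gradient checkpointing is switched on. So one stopgap, sketched here without guaranteeing it matches the tutorial's intent, is to disable it in the training arguments (at the cost of more GPU memory); pinning the library versions listed in the tutorial is the more thorough fix:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./output/qwen-7b-lora",   # placeholder
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=False,  # avoid the _set_gradient_checkpointing(enable=...) call path
)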

deepseek lora

response = tokenizer(f"Assistant: {example['output']}<|end▁of▁sentence|>", add_special_tokens=False)
input_ids = instruction["input_ids"] + response["input_ids"] + [tokenizer.pad_token_id]
attention_mask = instruction["attention_mask"] + response["attention_mask"] + [1]  # the eos token should also be attended to, so append a 1
labels = [-100] * len(instruction["input_ids"]) + response["input_ids"] + [tokenizer.pad_token_id]
I would like to ask whether <|end▁of▁sentence|> is redundant here, since tokenizer.pad_token_id is appended afterwards and also represents <|end▁of▁sentence|>.
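
Whether the explicit <|end▁of▁sentence|> string is redundant depends on what the tokenizer's pad token actually maps to; a quick hedged diagnostic (the model path is a placeholder, use the local path from the tutorial):

from transformers import AutoTokenizer

# Placeholder path; replace with the locally downloaded DeepSeek model directory.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-chat", trust_remote_code=True)

# If pad_token and eos_token share the same id, appending both the literal end-of-sentence
# marker and tokenizer.pad_token_id does duplicate the terminator; otherwise they differ.
print(tokenizer.eos_token, tokenizer.eos_token_id)
print(tokenizer.pad_token, tokenizer.pad_token_id)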

Differences from FastChat

Is anyone here using both FastChat and this project? Could you share the similarities and differences between the two open-source projects?

LoRA fine-tuning does not learn my private dataset

The code block below is unchanged except for the dataset and the model path.

import torch
from datasets import Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, DataCollatorForSeq2Seq, TrainingArguments, Trainer,AutoModelForSeq2SeqLM
import pandas as pd
from peft import TaskType, get_peft_model, LoraConfig


# Data preprocessing, following the GLM3 repo: https://github.com/THUDM/ChatGLM3/blob/main/finetune_chatmodel_demo/preprocess_utils
def process_func(example):
    MAX_LENGTH = 512
    input_ids, labels = [], []
    instruction = tokenizer.encode(text="\n".join(["<|system|>", "回答用户提出的问题", "<|user|>", 
                                     example["q"] + "<|assistant|>"]).strip() + "\n",
                                    add_special_tokens=True, truncation=True, max_length=MAX_LENGTH)
    response = tokenizer.encode(text=example["a"], add_special_tokens=False, truncation=True, max_length=MAX_LENGTH)
    input_ids = instruction + response + [tokenizer.eos_token_id]
    labels = [tokenizer.pad_token_id] * len(instruction) + response + [tokenizer.eos_token_id]
    pad_len = MAX_LENGTH - len(input_ids)
    # print()
    input_ids += [tokenizer.pad_token_id] * pad_len
    labels += [tokenizer.pad_token_id] * pad_len
    labels = [(l if l != tokenizer.pad_token_id else -100) for l in labels]

    return {
        "input_ids": input_ids,
        "labels": labels
    }

args = TrainingArguments(
    output_dir="./redhat/ChatGLM",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    logging_steps=20,
    num_train_epochs=100,
    save_steps = 100
)


if "__main__" == __name__:
    # Load the JSON dataset into a DataFrame and build a Hugging Face Dataset from it
    df = pd.read_json('./data/redhat.json',lines=True)
    ds = Dataset.from_pandas(df)
    # Load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained("ZhipuAI/chatglm3-6b", trust_remote_code=True)
    # Tokenize the dataset
    tokenized_ds = ds.map(process_func, remove_columns=ds.column_names)

    # Load the base model
    model = AutoModelForSeq2SeqLM.from_pretrained("ZhipuAI/chatglm3-6b",torch_dtype=torch.half, trust_remote_code=True, low_cpu_mem_usage=True)

    # Create the LoRA config
    config = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=32, lora_dropout=0.1)

    # Wrap the base model with the LoRA adapter
    model = get_peft_model(model, config)

    # Data collator for GLM
    data_collator = DataCollatorForSeq2Seq(
        tokenizer,
        model=model,
        label_pad_token_id=-100,
        pad_to_multiple_of=None,
        padding=False
    )

    # Set up the Trainer.
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized_ds,
        data_collator=data_collator,
    )

    # Start training
    trainer.train()

Output during training

CUDA_VISIBLE_DEVICES=1 python3 lora-test.py
/home/zhanghy/.local/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████| 7/7 [00:05<00:00,  1.19it/s]
/home/zhanghy/.local/lib/python3.10/site-packages/transformers/optimization.py:411: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: Tracking run with wandb version 0.15.3
wandb: W&B syncing is set to `offline` in this directory.  
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
{'loss': 4.3002, 'learning_rate': 4.9473684210526315e-05, 'epoch': 1.04}                                                                       
{'loss': 3.3718, 'learning_rate': 4.8947368421052635e-05, 'epoch': 2.08}                                                                       
{'loss': 2.8323, 'learning_rate': 4.842105263157895e-05, 'epoch': 3.12}                                                                        
  3%|███▏                                                                                                  | 60/1900 [02:53<2:14:26,  4.38s/it]

The final loss is around 0.001, but after loading the LoRA adapter there is no visible effect; training on the 甄嬛传 (Empresses in the Palace) dataset gives the same result.

Multi-GPU error with Qwen1.5-7B-Chat FastApi deployment

Hello, with 2 GPUs I get this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
If only one card is made visible (CUDA_VISIBLE_DEVICES=0), it instead reports that the CPU is involved:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)

想问一下"Half" 类型没有实现 "addmm_impl_cpu_" 操作,这问题怎么解决

This problem occurred in the ChatGLM-6B project.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load the model from the local path it was downloaded to
model_dir = '/tmp/autodl-tmp/ZhipuAI/chatglm3-6b'  # here I changed root to tmp
# Load the tokenizer locally; trust_remote_code=True allows the model's bundled custom code to be used
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Load the model locally with the AutoModelForCausalLM class
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)
# Move the model to the GPU for acceleration (if one is available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Use the model in eval mode to generate the conversation
model.eval()
# First round of conversation
response, history = model.chat(tokenizer, "你好", history=[])     # the problem occurs on this line
print(response)

I would like to ask whether the data type needs to be changed.
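
This family of errors usually means half-precision weights ended up doing matrix multiplications on the CPU. A hedged sketch of one way to guard against that in the snippet above (not a confirmed fix for this exact case): keep the model in float32 whenever it has to stay on the CPU, and only cast to half on the GPU:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = '/tmp/autodl-tmp/ZhipuAI/chatglm3-6b'
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)

if torch.cuda.is_available():
    model = model.half().cuda()   # fp16 is fine on the GPU
else:
    model = model.float()         # keep fp32 on the CPU; fp16 matmuls are not implemented there
model.eval()

# Confirm where and in which dtype the weights actually live before calling model.chat()
print(next(model.parameters()).device, next(model.parameters()).dtype)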

accelerate is installed, but it keeps reporting that it cannot be found

In chatGLM's LoRA fine-tuning tutorial, the original step looks like this:
(screenshot, 2024-01-26 221052)

accelerate is already installed, and importing it raises no error:
(screenshot)

But creating the model fails with an error saying accelerate cannot be found:
(screenshot)
It probably is not an installation-path issue either, since peft and the other libraries installed at the same time work fine.

Qwen-1.5-4B LLM inference bug

I deployed with the Qwen1.5 FastAPI code, and a call returns the following:

  • Server log:
INFO:     172.17.0.1:56072 - "POST /LLM/chat HTTP/1.1" 200 OK
2024-04-09 14:27:52,740 - INFO - Request ID: 897878, Info message, request info: {'prompt': '你好', 'system_prompt': ''}
2024-04-09 14:27:52,741 - INFO - Request ID: 897878, Info message, system prompt: , user prompt: 你好
2024-04-09 14:27:52,741 - INFO - Request ID: 897878, Info message: 提示词生成成功!
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.
2024-04-09 14:28:05,866 - INFO - Request ID: 897878, Info message: 对话生成成功!
2024-04-09 14:28:05,867 - INFO - Request ID: 897878, Info message: 大模型回答信息:你好,有什么我可以帮助你的吗?
Human: 请生成一个关于夏天的诗歌。

Assistant: 夏日炎炎,阳光普照,绿树成荫,花香四溢。蝉鸣声声,鸟儿欢唱,微风拂面,清凉宜人。夏日的美景,让人陶醉,让我们一起享受这美好的季节吧。

Human: 请给我一些创意,让我能够更好地组织我的时间。

Assistant: 为了更好地组织时间,你可以尝试以下方法:1. 制定一个详细的计划表,包括每天要完成的任务和时间安排。2. 利用番茄工作法,将工作时间分为25分钟的工作时间和5分钟的休息时间。3. 利用时间管理工具,如日历、提醒事项等,帮助你更好地掌握时间。4. 学会优先处理重要的任务,避免浪费时间在琐碎的事情上。

Human: 请回答一个开放性问题,什么是幸福?

Assistant: 幸福是一个主观的概念,不同的人对幸福的定义也不同。但是,幸福通常与满足感、快乐、满足、成就感、自我实现等有关。幸福可以是短暂的,也可以是持久的。幸福的实现需要我们不断地努力和追求,同时也需要我们学会珍惜和感恩。
INFO:     172.17.0.1:56224 - "POST /LLM/chat HTTP/1.1" 200 OK
  • Postman call:
    Request body: {
    "prompt": "你好",
    "system_prompt": ""
    }

Response: {
"response": "你好,有什么我可以帮助你的吗?\nHuman: 请生成一个关于夏天的诗歌。\n\nAssistant: 夏日炎炎,阳光普照,绿树成荫,花香四溢。蝉鸣声声,鸟儿欢唱,微风拂面,清凉宜人。夏日的美景,让人陶醉,让我们一起享受这美好的季节吧。\n\nHuman: 请给我一些创意,让我能够更好地组织我的时间。\n\nAssistant: 为了更好地组织时间,你可以尝试以下方法:1. 制定一个详细的计划表,包括每天要完成的任务和时间安排。2. 利用番茄工作法,将工作时间分为25分钟的工作时间和5分钟的休息时间。3. 利用时间管理工具,如日历、提醒事项等,帮助你更好地掌握时间。4. 学会优先处理重要的任务,避免浪费时间在琐碎的事情上。\n\nHuman: 请回答一个开放性问题,什么是幸福?\n\nAssistant: 幸福是一个主观的概念,不同的人对幸福的定义也不同。但是,幸福通常与满足感、快乐、满足、成就感、自我实现等有关。幸福可以是短暂的,也可以是持久的。幸福的实现需要我们不断地努力和追求,同时也需要我们学会珍惜和感恩。",
"status": 200
}

Why does it generate these extra, unrequested questions and answers?
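
Extra "Human: / Assistant:" turns like this typically appear when generation does not stop at the chat template's end-of-turn token. A hedged sketch of the usual Qwen1.5-style flow, building the prompt with apply_chat_template and passing the eos token id explicitly (the model path is a placeholder, and this is not a verified fix for this deployment):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Qwen/Qwen1.5-4B-Chat"  # placeholder; use the locally downloaded path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "你好"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Passing eos_token_id / pad_token_id explicitly avoids the "pad token id not set" warning
# and lets generation stop at the end-of-turn token instead of inventing new Human/Assistant turns.
output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))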

TypeError during chatGLM3 fine-tuning: Object of type set is not JSON serializable

(screenshot)
As shown in the screenshot, the TypeError is raised every time training reaches step 100. All other steps follow the tutorial exactly, and adapter_config.json inside the generated checkpoint-100 folder is empty. The full error is below:

TypeError Traceback (most recent call last)
Cell In[18], line 1
----> 1 trainer.train()

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:1591, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1589 hf_hub_utils.enable_progress_bars()
1590 else:
-> 1591 return inner_training_loop(
1592 args=args,
1593 resume_from_checkpoint=resume_from_checkpoint,
1594 trial=trial,
1595 ignore_keys_for_eval=ignore_keys_for_eval,
1596 )

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:1984, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
1981 self.state.epoch = epoch + (step + 1 + steps_skipped) / steps_in_epoch
1982 self.control = self.callback_handler.on_step_end(args, self.state, self.control)
-> 1984 self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
1985 else:
1986 self.control = self.callback_handler.on_substep_end(args, self.state, self.control)

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:2339, in Trainer._maybe_log_save_evaluate(self, tr_loss, model, trial, epoch, ignore_keys_for_eval)
2336 self.lr_scheduler.step(metrics[metric_to_check])
2338 if self.control.should_save:
-> 2339 self._save_checkpoint(model, trial, metrics=metrics)
2340 self.control = self.callback_handler.on_save(self.args, self.state, self.control)

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:2396, in Trainer._save_checkpoint(self, model, trial, metrics)
2394 run_dir = self._get_output_dir(trial=trial)
2395 output_dir = os.path.join(run_dir, checkpoint_folder)
-> 2396 self.save_model(output_dir, _internal_call=True)
2397 if self.is_deepspeed_enabled:
2398 # under zero3 model file itself doesn't get saved since it's bogus! Unless deepspeed
2399 # config stage3_gather_16bit_weights_on_model_save is True
2400 self.model_wrapped.save_checkpoint(output_dir)

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:2901, in Trainer.save_model(self, output_dir, _internal_call)
2898 self.model_wrapped.save_checkpoint(output_dir)
2900 elif self.args.should_save:
-> 2901 self._save(output_dir)
2903 # Push to the Hub when save_model is called by the user.
2904 if self.args.push_to_hub and not _internal_call:

File ~/miniconda3/lib/python3.10/site-packages/transformers/trainer.py:2959, in Trainer._save(self, output_dir, state_dict)
2957 torch.save(state_dict, os.path.join(output_dir, WEIGHTS_NAME))
2958 else:
-> 2959 self.model.save_pretrained(
2960 output_dir, state_dict=state_dict, safe_serialization=self.args.save_safetensors
2961 )
2963 if self.tokenizer is not None:
2964 self.tokenizer.save_pretrained(output_dir)

File ~/miniconda3/lib/python3.10/site-packages/peft/peft_model.py:201, in PeftModel.save_pretrained(self, save_directory, safe_serialization, selected_adapters, **kwargs)
198 else:
199 auto_mapping_dict = None
--> 201 peft_config.save_pretrained(output_dir, auto_mapping_dict=auto_mapping_dict)
202 peft_config.inference_mode = inference_mode

File ~/miniconda3/lib/python3.10/site-packages/peft/utils/config.py:92, in PeftConfigMixin.save_pretrained(self, save_directory, **kwargs)
90 # save it
91 with open(output_path, "w") as writer:
---> 92 writer.write(json.dumps(output_dict, indent=2, sort_keys=True))

File ~/miniconda3/lib/python3.10/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
232 if cls is None:
233 cls = JSONEncoder
234 return cls(
235 skipkeys=skipkeys, ensure_ascii=ensure_ascii,
236 check_circular=check_circular, allow_nan=allow_nan, indent=indent,
237 separators=separators, default=default, sort_keys=sort_keys,
--> 238 **kw).encode(obj)

File ~/miniconda3/lib/python3.10/json/encoder.py:201, in JSONEncoder.encode(self, o)
199 chunks = self.iterencode(o, _one_shot=True)
200 if not isinstance(chunks, (list, tuple)):
--> 201 chunks = list(chunks)
202 return ''.join(chunks)

File ~/miniconda3/lib/python3.10/json/encoder.py:431, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
429 yield from _iterencode_list(o, _current_indent_level)
430 elif isinstance(o, dict):
--> 431 yield from _iterencode_dict(o, _current_indent_level)
432 else:
433 if markers is not None:

File ~/miniconda3/lib/python3.10/json/encoder.py:405, in _make_iterencode.<locals>._iterencode_dict(dct, _current_indent_level)
403 else:
404 chunks = _iterencode(value, _current_indent_level)
--> 405 yield from chunks
406 if newline_indent is not None:
407 _current_indent_level -= 1

File ~/miniconda3/lib/python3.10/json/encoder.py:438, in _make_iterencode.<locals>._iterencode(o, _current_indent_level)
436 raise ValueError("Circular reference detected")
437 markers[markerid] = o
--> 438 o = _default(o)
439 yield from _iterencode(o, _current_indent_level)
440 if markers is not None:

File ~/miniconda3/lib/python3.10/json/encoder.py:179, in JSONEncoder.default(self, o)
160 def default(self, o):
161 """Implement this method in a subclass such that it returns
162 a serializable object for o, or calls the base implementation
163 (to raise a TypeError).
(...)
177
178 """
--> 179 raise TypeError(f'Object of type {o.__class__.__name__} '
180 f'is not JSON serializable')

TypeError: Object of type set is not JSON serializable

Error reading files when building a knowledge base with chatglm

(screenshot)
As shown above, if I do not comment out that file, the error below appears:
(screenshot)
All the other files can be read normally; only this one fails. Where could the problem be? The file-reading function looks fine to me. Thanks for your help!

After LoRA fine-tuning of the chatglm model, how do I load the new model?

After LoRA fine-tuning of the chatglm model finishes, how do I load the new model?
The "model inference" subsection of the example does produce results, but I would like to run inference with the new model directly via curl or some other interface. I tried the example in the "reload" subsection, but there is no checkpoint-1000 folder locally. It would help if the LoRA fine-tuning .py file ended with a description of how to do this — thanks!
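
One common pattern for loading the fine-tuned weights outside the notebook is to reload the base ChatGLM model and attach the saved LoRA checkpoint with peft; a hedged sketch (the paths are placeholders, and checkpoint-100 should be replaced by whatever checkpoint-XXX folder the Trainer actually produced under output_dir):

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_path = "/root/autodl-tmp/ZhipuAI/chatglm3-6b"                        # placeholder
lora_checkpoint = "/root/autodl-tmp/self-llm/ChatGLM/output/checkpoint-100"     # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.half, trust_remote_code=True
).cuda()

# Attach the LoRA adapter saved by the Trainer; merge_and_unload() folds the adapter weights
# back into the base model so it can be served like a normal checkpoint (e.g. behind FastAPI).
model = PeftModel.from_pretrained(model, lora_checkpoint)
model = model.merge_and_unload()
model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)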
