
Comments (6)

greyovo avatar greyovo commented on August 25, 2024

The last few lines of output-CoDeX/checkpoint-80000/trainer_state.json are:

{
  [
    {
      "epoch": 2.92,
      "learning_rate": 8.248803594695632e-06,
      "loss": 0.0629,
      "step": 79990
    },
    {
      "epoch": 2.92,
      "learning_rate": 8.212272135020273e-06,
      "loss": 0.0627,
      "step": 80000
    }
  ],
  "max_steps": 82221,
  "num_train_epochs": 3,
  "total_flos": 0.0,
  "trial_name": null,
  "trial_params": null
}

from kopa.

Zhang-Each avatar Zhang-Each commented on August 25, 2024

This may be a peft version issue. Check whether the saved LoRA weights are an empty file. This problem occurs when the peft and transformers versions do not match.
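
The check suggested above could be sketched roughly as follows. This is only a heuristic under stated assumptions: the file name `adapter_model.bin`, the helper name `adapter_looks_empty`, and the 1024-byte threshold are illustrative and not taken from this thread. A serialized empty state dict is small but not zero bytes, which is why the sketch uses a size threshold rather than testing for an empty file.

```python
from pathlib import Path


def adapter_looks_empty(checkpoint_dir, filename="adapter_model.bin", min_bytes=1024):
    """Heuristic: report True when the adapter file is missing or suspiciously small.

    `filename` and `min_bytes` are assumptions for illustration; adjust them to
    whatever your checkpoint directory actually contains.
    """
    path = Path(checkpoint_dir) / filename
    if not path.is_file():
        # No adapter file at all counts as "empty" for this check.
        return True
    return path.stat().st_size < min_bytes


if __name__ == "__main__":
    # Path taken from the thread; replace with your own checkpoint directory.
    print(adapter_looks_empty("output-CoDeX/checkpoint-80000"))
```

A result of `True` only suggests that the adapter weights were not written; loading the file and inspecting its keys would be the more reliable confirmation.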

Zhang-Each avatar Zhang-Each commented on August 25, 2024

Generally, a model that has been trained on the data should not perform at zero-shot level, so this is most likely a peft version issue. See this peft issue:
huggingface/peft#286

greyovo avatar greyovo commented on August 25, 2024

Thanks, I'll give it a try.

gaoIf avatar gaoIf commented on August 25, 2024

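Has this problem been resolved? I checked the saved LoRA weights and they are indeed empty.
My peft version is 0.3.0, as pinned in requirement.txt.
I tried commenting out the following: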

old_state_dict = model.state_dict

# model.state_dict = (
#     lambda self, *_, **__: get_peft_model_state_dict(
#         self, old_state_dict()
#     )
# ).__get__(model, type(model))

But then the saved LoRA weights are too large.
Do I need to upgrade to 0.4.0?
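
One likely reason for the size increase is that, without the override, the full model state dict (base weights included) gets saved instead of only the LoRA entries. The sketch below illustrates the idea of keeping only LoRA entries using plain dictionaries. The helper `keep_lora_entries`, the `"lora_"` marker, and the parameter names are hypothetical stand-ins for illustration, not peft's actual implementation or this repository's code.

```python
def keep_lora_entries(state_dict, marker="lora_"):
    """Return only the entries whose parameter name contains `marker`."""
    return {name: value for name, value in state_dict.items() if marker in name}


# Hypothetical parameter names; real names depend on the model and peft version.
full_state = {
    "base_model.model.layers.0.q_proj.weight": [0.0, 0.0, 0.0, 0.0],
    "base_model.model.layers.0.q_proj.lora_A.weight": [0.1],
    "base_model.model.layers.0.q_proj.lora_B.weight": [0.2],
}

lora_only = keep_lora_entries(full_state)
print(sorted(lora_only))
```

If the saved file is large, comparing the number of keys before and after such a filter can show whether base model weights are being written alongside the adapter.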

Zhang-Each avatar Zhang-Each commented on August 25, 2024

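Has this problem been resolved? I checked the saved LoRA weights and they are indeed empty. My peft version is 0.3.0, as pinned in requirement.txt. I tried commenting out the following: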

old_state_dict = model.state_dict

# model.state_dict = (
#     lambda self, *_, **__: get_peft_model_state_dict(
#         self, old_state_dict()
#     )
# ).__get__(model, type(model))

But then the saved LoRA weights are too large. Do I need to upgrade to 0.4.0?

This is a peft issue. You can try the solutions in the link above.
