Comments (7)

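enbacoo commented on September 4, 2024

> Yes, the HF-format weights we provide are also deltas against the original weights. You need to download the LLaMA HF weights first and then run the merge script. You can convert to the HF format, merge, and then test inference again to see whether the problem persists.
>
> We don't use any special prompt format; you can simply ask the question directly. If you want to know the prompts we used when evaluating reading comprehension, look in the config file for the class of the corresponding dataset; the default prompt is written in that class.

I used the LLaMA HF weights and merged them with llama-panda-zh-coig-7b-delta, but got the error 'LlamaForCausalLM' object has no attribute 'embed_tokens'. The error is raised at the fix that was committed to the script yesterday. My transformers version is the latest.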

from pandallm.

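SparkJiao commented on September 4, 2024

Hi, all

The problem has been fixed, and we have tested that merging now succeeds. Previously the .model prefix was missing in front of embed_tokens. Please pull the latest changes and try again. If inference quality is poor, please open a separate issue and post some bad cases there.

Thanks for your interest, everyone!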
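To illustrate why the missing .model prefix produces this AttributeError, here is a minimal sketch using toy stand-in classes. `ToyLlamaModel` and `ToyLlamaForCausalLM` are hypothetical names invented for this example, not the transformers classes; they only mimic the nesting in which the causal-LM wrapper holds the base model as `.model`, and the embedding table lives on that inner object.

```python
class ToyLlamaModel:
    """Hypothetical stand-in for the inner base model."""

    def __init__(self):
        self.embed_tokens = "embedding table"


class ToyLlamaForCausalLM:
    """Hypothetical stand-in for the causal-LM wrapper."""

    def __init__(self):
        self.model = ToyLlamaModel()       # the embeddings live one level down
        self.lm_head = "output projection"


wrapper = ToyLlamaForCausalLM()

# Accessing wrapper.embed_tokens directly would raise AttributeError,
# because the attribute only exists on the inner model.
has_direct = hasattr(wrapper, "embed_tokens")

# Going through the .model prefix reaches the embedding table.
embeddings = wrapper.model.embed_tokens

print(has_direct, embeddings)
```

This matches the parameter name model.embed_tokens.weight that appears in the size-mismatch message elsewhere in this thread.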

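SparkJiao commented on September 4, 2024

Yes, the HF-format weights we provide are also deltas against the original weights. You need to download the LLaMA HF weights first and then run the merge script. You can convert to the HF format, merge, and then test inference again to see whether the problem persists.

We don't use any special prompt format; you can simply ask the question directly. If you want to know the prompts we used when evaluating reading comprehension, look in the config file for the class of the corresponding dataset; the default prompt is written in that class.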
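The idea of delta weights can be sketched as follows. This is a toy, pure-Python illustration and not the project's merge script: `merge_delta` is a hypothetical helper, and nested lists stand in for real tensors. It assumes the merged weight for each parameter is simply the base weight plus the delta.

```python
from typing import Dict, List

Matrix = List[List[float]]


def merge_delta(base: Dict[str, Matrix], delta: Dict[str, Matrix]) -> Dict[str, Matrix]:
    """Return base + delta for every parameter name (hypothetical helper)."""
    if base.keys() != delta.keys():
        raise KeyError(f"parameter names differ: {sorted(base.keys() ^ delta.keys())}")
    merged = {}
    for name, base_w in base.items():
        delta_w = delta[name]
        same_rows = len(base_w) == len(delta_w)
        if not same_rows or any(len(b) != len(d) for b, d in zip(base_w, delta_w)):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise addition of the two matrices.
        merged[name] = [
            [b + d for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(base_w, delta_w)
        ]
    return merged


# Tiny made-up values, only to show the arithmetic.
base = {"lm_head.weight": [[1.0, 2.0], [3.0, 4.0]]}
delta = {"lm_head.weight": [[0.5, -0.5], [0.0, 1.0]]}
merged = merge_delta(base, delta)
print(merged["lm_head.weight"])
```

Because the published files only hold the difference, they are not usable on their own; the original LLaMA weights are needed to reconstruct the full model.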

xxxxuee commented on September 4, 2024

I got the same error when merging with the HF-format LLaMA weights.

enbacoo commented on September 4, 2024

> Hi, all
>
> The problem has been fixed, and we have tested that merging now succeeds. Previously the .model prefix was missing in front of embed_tokens. Please pull the latest changes and try again. If inference quality is poor, please open a separate issue and post some bad cases there.
>
> Thanks for your interest, everyone!

With the change, merging does work, but an earlier problem has reappeared:

size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
size mismatch for lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

I read that this had already been fixed earlier.
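The mismatch above is between a checkpoint with 32001 vocabulary rows and a model with 32000. One common way to reconcile such shapes, sketched here as an assumption and not as the fix the maintainers applied, is to grow the smaller matrix by the missing rows before loading or adding. `pad_rows` is a hypothetical helper and nested lists stand in for tensors; with transformers, the analogous operation would be the model's `resize_token_embeddings` method.

```python
from typing import List

Matrix = List[List[float]]


def pad_rows(weight: Matrix, target_rows: int) -> Matrix:
    """Return a copy of weight with zero rows appended up to target_rows (hypothetical helper)."""
    if target_rows < len(weight):
        raise ValueError("target_rows is smaller than the current row count")
    width = len(weight[0])
    padding = [[0.0] * width for _ in range(target_rows - len(weight))]
    return [row[:] for row in weight] + padding


# 3 rows stand in for the 32000-row model, 4 rows for the 32001-row checkpoint.
base_embed = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
delta_embed = [[0.1, 0.1], [0.1, 0.1], [0.1, 0.1], [9.0, 9.0]]

padded = pad_rows(base_embed, len(delta_embed))
print(len(base_embed), "->", len(padded))

# After padding, the shapes agree and the extra row comes entirely from the checkpoint side.
last_row = [b + d for b, d in zip(padded[-1], delta_embed[-1])]
print(last_row)
```

Both model.embed_tokens.weight and lm_head.weight would need the same treatment, since the error reports the extra row for each.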

SparkJiao commented on September 4, 2024

It works now, so please try again. I have loaded the model and run inference, and both work without problems.

Thanks!

enbacoo commented on September 4, 2024

> It works now, so please try again. I have loaded the model and run inference, and both work without problems.
>
> Thanks!

Great, it works now.
