
Comments (6)

skepsun commented on June 11, 2024

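When I ran the official DeepSpeed-Chat code without the ptx loss, there was some chance that the actor's outputs would become empty partway through training. With the ptx loss added (and its default ptx_coef is very large), the problem does not occur, and the results are decent. I also saw the approach of splitting the backward pass into two calls in its code. I'll try bringing that in first and see whether it runs.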

from openrlhf.

hijkzzz commented on June 11, 2024

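This bug is caused by DeepSpeed, and they have never fixed it:
microsoft/DeepSpeed#627
One workaround is to run backward separately for the ptx loss and the PPO loss (if you're interested, feel free to try it and submit an MR). One thing to watch out for: two backward calls break DeepSpeed's gradient accumulation step counting, so an additional patch is needed for that.
I haven't fixed this yet because most people don't have pretraining data and rarely use ptx.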
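
To illustrate the accumulation-count caveat, here is a toy sketch in plain Python. `ToyEngine` and `run` are hypothetical stand-ins, not DeepSpeed's API. The sketch assumes an engine that counts every backward call as one micro-step toward the gradient accumulation boundary, which may not match how DeepSpeed tracks it internally.

```python
class ToyEngine:
    """Hypothetical engine that counts each backward() call as one micro-step."""

    def __init__(self, accumulation_steps):
        self.accumulation_steps = accumulation_steps
        self.micro_steps = 0
        self.optimizer_steps = 0

    def backward(self, loss):
        # A real engine would accumulate gradients here; the toy only counts calls.
        self.micro_steps += 1

    def step(self):
        # Apply the optimizer only when the counter sits on an accumulation boundary.
        if self.micro_steps % self.accumulation_steps == 0:
            self.optimizer_steps += 1


def run(num_batches, accumulation_steps, split_backward):
    """Return how many optimizer steps happen over num_batches micro-batches."""
    engine = ToyEngine(accumulation_steps)
    for _ in range(num_batches):
        if split_backward:
            # Two backward calls per micro-batch advance the counter twice.
            engine.backward("ppo_loss")
            engine.backward("ptx_loss")
        else:
            engine.backward("ppo_loss + ptx_loss")
        engine.step()
    return engine.optimizer_steps
```

In this toy, `run(8, 4, split_backward=False)` reaches the boundary after every fourth batch, while `run(8, 4, split_backward=True)` reaches it after every second batch, so the optimizer steps more often than the configured accumulation intends. A fix would need to make the two backward calls count as a single micro-step.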


hijkzzz commented on June 11, 2024

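The underlying issue is that gradient checkpointing conflicts with running backward through two paths. If your model is small, it will also run with gradient checkpointing turned off.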


skepsun commented on June 11, 2024

With this change it runs:

self.strategy.backward(self.ptx_coef * ptx_loss, self.actor, self.actor_optim)
# actor_loss = ptx_loss * self.ptx_coef + raw_actor_loss

What does the extra patch refer to?
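
The two-backward change in this comment relies on gradients being additive: backpropagating the PPO loss and then `ptx_coef * ptx_loss` should leave the same accumulated gradient as a single backward over their sum. Below is a minimal plain-Python sketch with hand-derived gradients on one scalar weight. The two quadratic losses and all function names are hypothetical stand-ins, not the real objectives.

```python
def ppo_grad(w):
    # Derivative of the stand-in PPO loss (w - 1)^2 with respect to w.
    return 2.0 * (w - 1.0)


def ptx_grad(w):
    # Derivative of the stand-in ptx loss (w + 2)^2 with respect to w.
    return 2.0 * (w + 2.0)


def combined_grad(w, ptx_coef):
    # One backward pass over ppo_loss + ptx_coef * ptx_loss.
    return ppo_grad(w) + ptx_coef * ptx_grad(w)


def split_grad(w, ptx_coef):
    # Two backward passes accumulating into the same gradient buffer.
    grad = 0.0
    grad += ppo_grad(w)
    grad += ptx_coef * ptx_grad(w)
    return grad
```

The two functions agree only when the second pass is scaled by `ptx_coef`. If the coefficient is left out of that backward call, the accumulated gradient differs from the combined-loss gradient.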


hijkzzz commented on June 11, 2024

Fixed.


skepsun commented on June 11, 2024

It looks like ptx_coef isn't actually being used.

