Comments (4)
We referred only to your directory structure; we developed everything else ourselves, especially the core RL technology and the performance optimizations. I don't think a directory structure and a few lines of comments amount to a real contribution.
from openrlhf.
Hello, our code is built on DeepSpeed and Ray, while yours is built on ColossalAI, so we clearly cannot be using the same technology stack.
Our framework uses step-wise RL, whereas yours uses single-step RL.
The core of a framework is its performance optimization, hyperparameter tuning, and PPO implementation details, not its directory structure or comments.
Finally, your ColossalChat code also includes contributions from us, including PPO tuning and Ray.
from openrlhf.
The ColossalChat team contacted me earlier about RL issues.
The WeChat history proves that ColossalChat's core RL technology was supported by me.
Our developers also contributed the Ray components to ColossalChat:
https://github.com/hpcaitech/ColossalAI/pull/3309/files
from openrlhf.
@binmakeswell
Thank you for the discussion.
Let's keep in touch in the future and work together to improve the RLHF ecosystem.
from openrlhf.
Related Issues (20)
- Timeout error after switching PPO to ZeRO stage 3 HOT 2
- maybe data bug with dpo trainer HOT 1
- QLORA model loading error HOT 5
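- We are comparing the performance of DSChat and OpenRLHF to decide which to adopt; could you provide the fixed DSChat code so we can reproduce the performance comparison data provided by the community? HOT 7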
- Avoid monkey patching vLLM HOT 1
- Claim your paper on HF HOT 1
- [Question] EOS in reward model dataset HOT 3
- Redundant computation of action_log_probs HOT 2
- Support Llama-3 models HOT 1
- Incompatibility with Qwen HOT 2
- Suggestion on the configurations HOT 1
- Strange Kill of Critic Model HOT 5
- Training DPO with Deepseek-lite fails with "expected mat1 and mat2 to have the same type, but got: float != c10::BFloat16" HOT 2
- Will 2 x GPU setups be supported HOT 1
- Dummy token for prompts in HH datasets HOT 2
- Does this codebase consider using "torch.compile"? HOT 2
- wrong action_log_probs returned? HOT 1
- Could support for SimPO be added?
- zero3 training error HOT 1
- Failed to update weights to vLLM HOT 3