Comments (7)
Also @hank0316 -- if you want to open a PR with that solution we can test it further!
from reward-bench.
@hank0316 nope, not to my knowledge. Most use the tokenizer's implementation.
from reward-bench.
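For anyone following along: a quick way to see whether a model would be evaluated with its own template or a fallback is to check whether its tokenizer ships a chat template at all. This is a minimal sketch assuming the Hugging Face `transformers` tokenizer API; the model name is an illustrative placeholder, not a claim about how the leaderboard is actually run.

```python
# Sketch: check whether a model's tokenizer defines its own chat template.
# Model name is illustrative only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/reward-model-deberta-v3-large-v2")

if tokenizer.chat_template is not None:
    print("Tokenizer ships its own chat template (tokenizer_config.json)")
else:
    print("No built-in chat template; evaluation code must supply a fallback")
```

If `chat_template` is `None`, evaluation code has to fall back to an explicitly configured template (such as the tulu template), which is where mismatches with a model's SFT formatting can creep in.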
Thanks for raising this. Let's take a look. Maybe @ValentinaPy can help (she mentioned some interest in continuing on this project).
from reward-bench.
@natolambert , I have opened the PR. Would you kindly review it? Additionally, I apologize for inadvertently pressing the close button; I am not familiar with adding comments to an issue.
from reward-bench.
Hey @natolambert, I have a question about training a reward model. Do you think it's necessary to incorporate chat templates during data preprocessing for RM training? And if yes, should the template align with those used in SFT?
from reward-bench.
Yes @hank0316 chat templates are important. I think there can be slight differences (e.g. for RM you aren't generating afterwards, iirc), but it should match at a high level.
from reward-bench.
Sure, @natolambert! I appreciate your response and this fantastic benchmark. I'm also curious whether any models on the leaderboard use the tulu chat template for evaluation but their own chat template in SFT, i.e. the template the model was trained with in SFT differs from the one used in this benchmark.
from reward-bench.
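To make the "match at a high level" point concrete, here is a minimal sketch of applying a chat template while preprocessing pairwise preference data for RM training. It assumes a Hugging Face tokenizer with a built-in chat template; the model name and the `prompt`/`chosen`/`rejected` field names are illustrative, not the exact preprocessing used by reward-bench.

```python
# Sketch: format pairwise preference data with a chat template for RM training.
# Assumes a Hugging Face tokenizer that ships a chat template; the model name
# and record fields below are illustrative placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/tulu-2-dpo-7b")

example = {
    "prompt": "What is a reward model?",
    "chosen": "A reward model scores responses so RLHF can rank them.",
    "rejected": "I don't know.",
}

def format_pair(example):
    # Unlike generation-time formatting, no generation prompt is appended:
    # the reward model scores a complete prompt + response pair.
    def render(response):
        return tokenizer.apply_chat_template(
            [
                {"role": "user", "content": example["prompt"]},
                {"role": "assistant", "content": response},
            ],
            tokenize=False,
            add_generation_prompt=False,
        )

    return {
        "text_chosen": render(example["chosen"]),
        "text_rejected": render(example["rejected"]),
    }

print(format_pair(example)["text_chosen"])
```

The key difference from generation-time formatting is `add_generation_prompt=False`: the reward model scores a finished prompt/response pair rather than generating a continuation.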