I think there might be an error in calculating the mean log probability when using GPT-3. The main issue is that GPT-3 does not return only the generated text: the `token_logprobs` list under `logprobs` can include tokens past the stop sequence. Therefore, to calculate the mean log probability, we cannot simply use
# calculate mean log prob across tokens
mean_log_probs = [np.mean(response['choices'][i]['logprobs']['token_logprobs']) for i in range(sampling_params['n'])]
Instead, we should stop counting once the stop token is reached.
For example, here is a response with a stop sequence of "\n". The generated text is " Walk to kitchen", but GPT-3 returns log probabilities for more tokens than that:
response: {
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": {
"text_offset": [
317,
322,
325,
333,
333,
333,
333,
333
],
"token_logprobs": [
-0.2976162,
-0.00012346054,
-0.5069456,
-0.0011470452,
-0.0060894582,
-0.00028055036,
-6.838237e-05,
-0.054386232
],
"tokens": [
" Walk",
" to",
" kitchen",
"\n",
"Step",
" 2",
":",
" Walk"
],
"top_logprobs": [
{
" Get": -3.9821253,
" Go": -3.5860093,
" Make": -3.1428235,
" Wake": -2.513738,
" Walk": -0.2976162
},
{
" To": -12.335158,
" in": -11.411637,
" into": -9.384543,
" to": -0.00012346054,
" upstairs": -12.2138815
},
{
" bedroom": -5.3587174,
" dining": -1.0860167,
" kitchen": -0.5069456,
" living": -4.34434,
" the": -3.2986841
},
{
"\n": -0.0011470452,
" ": -7.6692185,
" table": -9.372099,
".": -8.122213,
"ette": -9.167303
},
{
"\n": -5.1904135,
" Step": -7.8304586,
"Step": -0.0060894582,
"Task": -9.905375,
"step": -10.6300955
},
{
" 1": -10.295448,
" 2": -0.00028055036,
" 3": -11.589857,
" 4": -12.77457,
"2": -8.387781
},
{
"\n": -11.062581,
" :": -11.94543,
",": -12.268325,
".": -10.367215,
":": -6.838237e-05
},
{
" Find": -3.783928,
" Open": -4.0909195,
" Turn": -5.903181,
" Walk": -0.054386232,
"Walk": -5.14835
}
]
},
"text": " Walk to kitchen"
}
],
"model": "text-davinci-001",
"object": "text_completion",
"usage": {
"completion_tokens": 3,
"prompt_tokens": 94,
"total_tokens": 97
}
}
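Here is a sketch of one possible fix, based on the example response above. It truncates `token_logprobs` at the first occurrence of the stop token before averaging; the names `mean_log_prob` and `STOP_TOKEN` are just for illustration, and the stop sequence of "\n" is assumed from the example.

```python
import numpy as np

# Assumed from the example above; adjust to match sampling_params['stop'].
STOP_TOKEN = "\n"

def mean_log_prob(choice, stop_token=STOP_TOKEN):
    """Mean log prob over generated tokens, truncated at the first stop token."""
    tokens = choice['logprobs']['tokens']
    token_logprobs = choice['logprobs']['token_logprobs']
    # Count only tokens up to (but not including) the first stop token.
    end = tokens.index(stop_token) if stop_token in tokens else len(tokens)
    return np.mean(token_logprobs[:end])

# Usage, mirroring the original list comprehension:
# mean_log_probs = [mean_log_prob(response['choices'][i])
#                   for i in range(sampling_params['n'])]
```

With the example response above, this averages only the log probs for " Walk", " to", and " kitchen", matching the 3 completion tokens reported in `usage`.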
Please let me know what you think. Great work!