psunlpgroup / realmistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
Home Page: https://arxiv.org/abs/2404.03602
License: Other