Comments (8)
Thanks, very helpful! It looks like Claude v2 may be returning a different format from expected. The client-side error occurs because the score value is `null`.
I'll handle the client side error so it doesn't crash the page, but this will also require an update to the Bedrock provider code.
from promptfoo.
One potentially temporary workaround is to add a `transform` directive to your test cases to unpack the nested `completion` value in the Claude response. That would look something like this:
defaultTest:
  options:
    transform: JSON.parse(output).completion
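For reference, here is a minimal sketch of what that transform expression evaluates. It assumes the provider hands back Claude's raw response body as a JSON string with a top-level `completion` field. The sample payload is made up for illustration and was not captured from Bedrock.

```javascript
// Hypothetical raw response body. Only the `completion` field name comes
// from the thread; the values and the other field are illustrative.
const output = JSON.stringify({
  completion: " Hello there",
  stop_reason: "stop_sequence",
});

// Same expression as in the transform directive: parse the JSON string,
// then keep only the nested completion text.
const transformed = JSON.parse(output).completion;

console.log(transformed);
```

If `output` is not a JSON string at the point the transform runs, `JSON.parse` throws, so this only works under the assumption stated above.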
Are you able to share the contents of `~/.promptfoo/output/latest.json`? That will help me troubleshoot, as I don't have access to the claude-v2 model on Bedrock.
Sure!
latest.json
Thanks a lot @typpo!
Unfortunately, the transform didn't work.
[FAIL] SyntaxError: Unexpected token o in JSON at position 1 (identical in both output columns)

SyntaxError: Unexpected token o in JSON at position 1
    at JSON.parse (<anonymous>)
    at eval (eval at transformOutput (/Users/sean.hellebusch/.npm/_npx/81bbc6515d992ace/node_modules/promp...
Sorry, it's been a while and I still don't have access to Claude 2.1 on Bedrock. However, I think I noticed the issue in code (a030796). The fix should be out in 0.43.1, and as a temporary workaround I believe you can disable the cache (`promptfoo eval --no-cache`).
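On the `Unexpected token o in JSON at position 1` message reported earlier in the thread: one common cause of that exact error is calling `JSON.parse` on a value that is already an object, because the object is first coerced to the string `[object Object]` and the parser then fails on the `o`. Whether that is what happened here is an assumption, since the thread does not confirm it. The sketch below reproduces that failure and shows a defensive variant; `unpackCompletion` is a hypothetical helper name, not part of promptfoo.

```javascript
// Hypothetical helper (not a promptfoo API): accepts either a JSON string
// or an already-parsed object and returns the nested completion text.
function unpackCompletion(output) {
  const parsed = typeof output === "string" ? JSON.parse(output) : output;
  return parsed.completion;
}

// Reproduce the failure mode: the object is coerced to "[object Object]"
// before parsing, so JSON.parse throws a SyntaxError.
let reproduced = null;
try {
  JSON.parse({ completion: "hi" });
} catch (err) {
  reproduced = err.name;
}
console.log(reproduced);

// The guarded version handles both shapes.
console.log(unpackCompletion({ completion: "hi" }));
console.log(unpackCompletion('{"completion":"hi"}'));
```

The same guard could be inlined in the transform expression, for example `typeof output === 'string' ? JSON.parse(output).completion : output.completion`, assuming the transform receives one of those two shapes.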