Comments (4)
Using `evaluate` in our monitoring system is an important use case for us.
from deepeval.
@prescod the summarization metric is the only metric that does not have caching
@penguine-ip Well then I guess it's bad luck that I used that test randomly to test the caching!
I find it interesting that caching is done at the test layer and not the LLM layer. I would expect you to cache LLM inputs and outputs given that that's where all of the slow stuff happens.
And then it would automatically transfer over to all tests.
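To make the suggestion concrete, here is a minimal sketch of what caching at the LLM layer could look like. This is not deepeval's API: `CachedLLM`, `fake_llm`, and the `model_name` value are all hypothetical names made up for illustration, and the cache is a plain in-memory dict. The idea is only that every metric calling through the wrapper would reuse earlier completions for identical prompts.

```python
import hashlib
import json
from typing import Callable, Dict


class CachedLLM:
    """Hypothetical wrapper: caches completions of any prompt -> text callable."""

    def __init__(self, generate: Callable[[str], str], model_name: str = "hypothetical-model"):
        self._generate = generate
        self._model_name = model_name
        self._cache: Dict[str, str] = {}
        self.calls = 0  # counts real (uncached) calls to the underlying model

    def _key(self, prompt: str) -> str:
        # Key on model name plus prompt so different models never share entries.
        payload = json.dumps({"model": self._model_name, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def __call__(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._generate(prompt)
        return self._cache[key]


def fake_llm(prompt: str) -> str:
    # Stand-in for a slow model call (assumption: a real client would go here).
    return prompt.upper()


llm = CachedLLM(fake_llm)
first = llm("summarize this")
second = llm("summarize this")  # served from the cache, no second model call
```

With something like this, caching would not depend on which metric issues the prompt. It only helps when the prompts are byte-for-byte identical, which assumes deterministic prompt construction in the metric.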
@prescod That's actually a good point. I think the reason was that this was easier to implement. I also added a note to the docs to let users know that the summarization metric doesn't support caching: https://docs.confident-ai.com/docs/metrics-summarization