aimonlabs / aimon-rely

AIMon Rely is a state-of-the-art system consisting of multiple models for detecting LLM quality issues during offline evaluations and continuous production monitoring. We offer various model quality metrics that are fast, reliable and cost-effective.

Home Page: https://aimon.ai

License: MIT License

Python 100.00%
evaluation generative-ai continuous-monitoring hallucination-detection llm

🎉 Welcome to AIMon Rely

AIMon Rely helps developers build, ship, and monitor LLM apps more confidently and reliably with its state-of-the-art, multi-model system for detecting LLM quality issues. It works for both offline evaluations and continuous production monitoring. AIMon Rely offers fast, reliable, and cost-effective hallucination detection, and it also supports other important quality metrics such as completeness, conciseness, and toxicity. Read our blog post for more details.

✨ Join our community on Slack

Metrics Supported

The following is a list of quality metrics that are currently available and on our roadmap. Please reach out to express your interest in any of these.

Metric                                            Status
Model Hallucination (Passage and Sentence Level)  ✓
Completeness                                      ✓
Conciseness                                       ✓
Toxicity                                          ✓
Semantic Similarity                               ⌛
Sentiment                                         ⌛
Coherence                                         ⌛
Sensitive Data (PII/PHI/PCI)                      ⌛

Getting Started

You can use AIMon through either an SDK or an API. The instructions below cover (1) using the SDK together with AIMon's UI and (2) calling AIMon directly through our REST APIs.

1. SDK and UI

AIMon supports both asynchronous instrumentation and synchronous detection for the metrics listed above. Follow these steps to get started with the AIMon SDK and the product.

  • Step 1: Get access to the beta product by joining the waitlist on our website, requesting it on Slack, or sending an email to [email protected]
  • Step 2: Install the AIMon SDK by running pip install aimon in your terminal.
  • Step 3: For an example of how to instrument an LLM application asynchronously using the SDK, please refer to the sample notebook.
  • Step 4: For an example of synchronous detections using the SDK, please refer to the sample Streamlit application.
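
To make the difference between synchronous detection and asynchronous instrumentation concrete, here is a minimal, self-contained sketch of the two calling patterns. It does not use the AIMon SDK: `toy_detector`, `detect_sync`, and `AsyncInstrumentation` are hypothetical names invented for this illustration, and the scoring logic is a deliberately naive stand-in for a real detection model.

```python
import queue
import threading
from typing import Callable, Dict, List, Tuple

Detector = Callable[[str, str], Dict[str, float]]


def toy_detector(context: str, generated_text: str) -> Dict[str, float]:
    """Hypothetical stand-in detector (NOT the AIMon SDK).

    Returns the fraction of generated words that never appear in the context.
    """
    context_words = set(context.lower().split())
    words = generated_text.lower().split()
    unsupported = [w for w in words if w not in context_words]
    return {"unsupported_ratio": len(unsupported) / max(len(words), 1)}


def detect_sync(context: str, generated_text: str,
                detector: Detector = toy_detector) -> Dict[str, float]:
    """Synchronous detection: the caller blocks until the score is available."""
    return detector(context, generated_text)


class AsyncInstrumentation:
    """Asynchronous instrumentation: record a call now, score it in the background."""

    def __init__(self, detector: Detector = toy_detector) -> None:
        self._detector = detector
        self._queue: "queue.Queue[Tuple[str, str]]" = queue.Queue()
        self.results: List[Dict[str, float]] = []
        # Daemon thread so a forgotten flush() never keeps the process alive.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, context: str, generated_text: str) -> None:
        # Returns immediately, so the application's request path is not slowed down.
        self._queue.put((context, generated_text))

    def _run(self) -> None:
        while True:
            context, generated_text = self._queue.get()
            self.results.append(self._detector(context, generated_text))
            self._queue.task_done()

    def flush(self) -> None:
        # Block until every recorded call has been scored.
        self._queue.join()


context = "the cat sat on the mat"
answer = "the cat sat on the moon"

# Synchronous: score is returned inline and can gate the response.
print(detect_sync(context, answer))

# Asynchronous: the call is logged, and scores are collected later.
monitor = AsyncInstrumentation()
monitor.record(context, answer)
monitor.flush()
print(monitor.results)
```

The synchronous path suits cases where the application needs the score before responding, while the asynchronous path suits continuous monitoring where added latency is unacceptable. For the actual SDK calls, the sample notebook and the sample Streamlit application referenced in the steps above remain the authoritative examples.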

2. API

AIMon detections can be accessed via REST APIs. Here are the steps to access the API:
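
The concrete endpoint, authentication scheme, and request schema are not spelled out in this README, so the following is only a rough sketch of what calling a detection REST API from Python generally looks like. Everything specific in it is an assumption made for illustration: the URL (`api.example.invalid`), the bearer-token header, and the `context` and `generated_text` payload fields are hypothetical and are not AIMon's documented API.

```python
import json
import urllib.request

# Hypothetical placeholder URL; replace with the endpoint you receive with your API access.
API_URL = "https://api.example.invalid/v1/detect"


def build_detection_request(api_key: str, context: str,
                            generated_text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a detection call.

    The payload field names and the bearer-token auth are assumptions.
    """
    payload = {"context": context, "generated_text": generated_text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_detection_request(
    api_key="YOUR_API_KEY",
    context="The Eiffel Tower is in Paris.",
    generated_text="The Eiffel Tower is in Rome.",
)
print(request.get_method(), request.full_url)

# Sending is left commented out because the URL above is a placeholder:
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the sketch runnable offline and makes it easy to swap in the real URL and payload fields once you have API access.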

Sandbox

You can quickly try our hallucination detection models in the Sandbox available on our website.

Benchmarks

Hallucination Detection

To demonstrate the effectiveness of our system, we benchmarked it against popular industry benchmarks for the hallucination detection task. The table below shows our results.

A few key takeaways:

✅ AIMon Rely is 10x cheaper than GPT-4 Turbo.

✅ AIMon Rely is 4x faster than GPT-4 Turbo.

✅ AIMon Rely provides the convenience of a fully hosted API that includes baked-in explainability.

✅ AIMon Rely supports a context length of up to 32,000 tokens (with plans to expand this further in the near future).

Overall, AIMon Rely is 10 times cheaper, 4 times faster, and close to or even better than GPT-4 on the benchmarks, making it a suitable choice for both offline and online hallucination detection.

Hallucination Benchmarks

Completeness, Conciseness Detection

There is a lack of industry-standard benchmark datasets for these metrics. We will be publishing an evaluation dataset soon. Stay tuned! ⌛

Pricing

Please reach out to [email protected] for pricing details related to the product and the API.

Future Work

  • We are working on additional metrics as detailed in the table above.
  • In addition, we are working on something awesome to make the offline evaluation and continuous model quality monitoring experience more seamless.

Join our Slack for the latest updates and discussions on generative AI reliability.

Contributors

aimonp, pjoshi30
