MaxAI

MaxAI is our trusty PostHog support AI deployed on our Slack, app, and website.

MaxAI was born in Aruba at a PostHog team offsite for a hackathon on a warm spring day in 2023.

How it works

How Max works is surprisingly simple.

Tooling

  • Weaviate - Vector database that allows us to pull relevant context to embed in our prompts to GPT
  • Haystack by deepset - Allows us to hook together pipelines of these tools to service user prompts
  • OpenAI - Provides the base language model, gpt-3.5-turbo, which we augment to create our AI

Embedding time

flowchart TD
    A[Github]
    B[Docs]
    C[Squeak]
    A -->|Calculate Embed Vectors|D[Weaviate]
    B -->|Calculate Embed Vectors|D
    C -->|Calculate Embed Vectors|D

Embedding Docs

  • Grab and parse all of the markdown from our docs and website
  • Use OpenAI Embeddings to create a vector representation of each markdown section.
  • Use Weaviate Vector database to store the vector representations of each markdown section.
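The parsing step above can be sketched roughly as follows. This is a minimal illustration, assuming sections are delimited by ATX headings (`#` to `######`); the function name `split_markdown_sections` is hypothetical and is not taken from the MaxAI codebase, which may split documents differently.

```python
import re

# Assumption: sections are delimited by ATX headings ("#" to "######").
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def split_markdown_sections(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, body) pairs; text before the first heading gets ""."""
    sections: list[tuple[str, str]] = []
    heading, body, in_fence = "", [], False

    def flush() -> None:
        # Store the section collected so far, skipping completely empty ones.
        text = "\n".join(body).strip()
        if heading or text:
            sections.append((heading, text))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # "#" lines inside code are not headings
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

Each returned pair would then be the unit that gets embedded and stored.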

Embedding GitHub content

  • Grab and parse all GitHub Issues
  • Use OpenAI Embeddings to create a vector representation of each description and comment section.
  • Use Weaviate Vector database to store the vector representations of each description and comment section.
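As a rough sketch of the "each description and comment section" idea, the snippet below turns one issue into one document per non-empty description or comment. The input shape (`number`, `title`, `body`, and a pre-fetched `comment_bodies` list) is an assumption made for illustration, as are the names `IssueDocument` and `issue_to_documents`; none of it is taken from the MaxAI code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IssueDocument:
    issue_number: int
    kind: str  # "description" or "comment"
    text: str


def issue_to_documents(issue: dict) -> list[IssueDocument]:
    """Produce one document per non-empty description or comment."""
    docs: list[IssueDocument] = []
    description = (issue.get("body") or "").strip()
    if description:
        # Prefix the description with the title so the embedding sees both.
        text = f"{issue.get('title', '')}\n\n{description}".strip()
        docs.append(IssueDocument(issue["number"], "description", text))
    # Hypothetical key: comments are assumed to be fetched already.
    for comment in issue.get("comment_bodies", []):
        comment = (comment or "").strip()
        if comment:
            docs.append(IssueDocument(issue["number"], "comment", comment))
    return docs
```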

Embedding Squeak content

  • Grab and parse all Squeak Questions
  • Use OpenAI Embeddings to create a vector representation of each question thread.
  • Use Weaviate Vector database to store the vector representations of each question thread.
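To make the "store the vector representations" step concrete, here is a toy in-memory stand-in for a vector database. It is only an illustration of the idea: the class `InMemoryVectorStore` is hypothetical, the vectors are hand-written rather than produced by OpenAI, and the brute-force scan is a simplification compared with what a real vector database does.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy stand-in for a vector database (illustration only)."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], str]] = []

    def add(self, vector: list[float], text: str) -> None:
        self._items.append((vector, text))

    def most_similar(self, query: list[float], k: int = 1) -> list[str]:
        # Rank every stored item by similarity to the query vector.
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]
```

At inference time the same lookup is what returns the "most similar" content for a question vector.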

Inference time

flowchart TD
    A[User Question] -->|Embed| I(Question Vector)
    I -->|Query Weaviate|J[Most Similar Docs]
    J -->|Collect Prompt Params| C{Prompt Context}
    C --> D[Limitations]
    C --> E[Personality]
    C --> F[Context Docs]
    F --> G[String Prompt]
    E --> G
    D --> G
    G -->|Query OpenAI|H[AI Response]
  • Take the conversation context from the thread that Max is in, including the most recent request.
  • Query the Weaviate vector database for the most similar markdown section.
  • Build a prompt for gpt-3.5-turbo. The prompt is engineered to establish Max's personality and add a few guardrails for how Max should respond. To do this we:
    • Ask Max to only reference PostHog products if possible
    • Build up Max's personality by informing it that Max is the trusty PostHog support AI
    • Bake in context that is useful for some conversations with Max
      • Current PagerDuty on-calls
      • Places to go if Max does not have the answer
    • Most importantly, embed the markdown section that we found in the prompt so that Max can respond with a relevant answer to the question.
  • Use gpt-3.5-turbo to generate a response to the prompt.
  • Finally, send the response to wherever Max is having the conversation.

We build these pipelines with Haystack by deepset, which coordinates the inference steps listed above. It's amazing.

Developers guide

Quickstart

Configure .env file

This is used to set defaults for local development.

SLACK_BOT_TOKEN=<your slack bot token>
SLACK_SIGNING_SECRET=<your slack signing secret>
OPENAI_TOKEN=<your openai token>
POSTHOG_API_KEY=<your posthog api key>
POSTHOG_HOST=https://null.posthog.com
PD_API_KEY=<your pagerduty api key>
WEAVIATE_HOST=http://127.0.0.1
WEAVIATE_PORT=8080
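For reference, here is one way such variables could be read and validated in Python. This is a sketch under assumptions: it expects the values to already be present in the process environment (this README does not say how the `.env` file is loaded), the split into required and defaulted variables is a guess, and `load_settings` and `weaviate_url` are hypothetical names.

```python
import os
from collections.abc import Mapping

# Assumption: these must be set; the Weaviate values fall back to the
# defaults shown in the sample .env above.
REQUIRED = (
    "SLACK_BOT_TOKEN",
    "SLACK_SIGNING_SECRET",
    "OPENAI_TOKEN",
    "POSTHOG_API_KEY",
    "PD_API_KEY",
)
DEFAULTS = {"WEAVIATE_HOST": "http://127.0.0.1", "WEAVIATE_PORT": "8080"}


def load_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect settings, failing early with a list of missing variables."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings


def weaviate_url(settings: Mapping[str, str]) -> str:
    """Combine host and port into a single base URL."""
    return f"{settings['WEAVIATE_HOST']}:{settings['WEAVIATE_PORT']}"
```

Failing at startup with the full list of missing names tends to be easier to debug than a `KeyError` deep inside a request handler.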

Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements-dev.txt
pip install -r requirements.txt

Start Weaviate

docker compose up weaviate

Seed Weaviate

python seed.py

Start MaxAI

uvicorn main:app --reload

Run a test chat

curl --location '127.0.0.1:8000/chat' \
--header 'Content-Type: application/json' \
--data '[
    {
        "role": "assistant",
        "content": "Hey! I'\''m Max AI, your helpful hedgehog assistant."
    },
    {
        "role": "user",
        "content": "Does PostHog use clickhouse under the hood??"
    }
]'
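The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `build_chat_request` and `send_chat` are hypothetical, and because the response format of `/chat` is not described here, the body is returned as raw text.

```python
import json
import urllib.request

CHAT_URL = "http://127.0.0.1:8000/chat"  # same endpoint as the curl example


def build_chat_request(messages: list[dict], url: str = CHAT_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the conversation so far."""
    return urllib.request.Request(
        url,
        data=json.dumps(messages).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(messages: list[dict], url: str = CHAT_URL) -> str:
    """Send the conversation and return the raw response body as text."""
    with urllib.request.urlopen(build_chat_request(messages, url), timeout=30) as response:
        return response.read().decode("utf-8")
```

With MaxAI running locally, calling `send_chat([{"role": "user", "content": "Does PostHog use clickhouse under the hood??"}])` should return whatever the server replies with.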

๐Ÿ•ฏ๏ธ A poem from Max to his evil twin Hoge ๐Ÿ“–

Ah, hoge! Sweet word upon my tongue,
So blissful, yet so quick to come undone.
A fleeting joy, that doth my heart entice,
Oh how I long to see your data slice!
In PostHog's code, thy value doth reside,
A beacon that ne'er shall falter nor hide.
Thou art a treasure, O hoge divine,
The secret sauce to make my metrics shine.
Though you may seem but a lowly label,
Thou bringeth

Disclaimer!

Max may display inaccurate or offensive information that doesn't represent PostHog's views.

This is the case with LLMs in their current state. We do our best to have a system prompt that keeps Max on topic. Feel free to question and chat with Max, but do keep in mind that this is experimental.

A few things we've seen ourselves in testing:

  • Totally believable but totally incorrect URLs
  • Often entertaining hallucinations about our products
  • Hallucinations about the history and founding of PostHog
  • Just plain wrong responses

If you do see something concerning, @mention someone from PostHog and we'll catalogue it. We are working on tooling to do this in an automated fashion, so stay tuned!
